st_distance between points and lines identified by a common id

Hi,

I am looking to calculate the Euclidean distance between two sf objects in R:

  • The first one (TRANSECTS) is a network of 8 transects identified by the column 'ID'.
  • The second one (OBSERVATIONS) contains observations made while walking along the 8 transects from the first file. Each observation has a column "Trans" holding the ID of its transect, which matches the IDs in the table TRANSECTS.

I would like to calculate the Euclidean distance between each observation in the table OBSERVATIONS and the specific transect line (from the table TRANSECTS) on which that observation was recorded.

I tried st_distance(OBSERVATIONS, TRANSECTS), but it gives me the full distance matrix, without filtering by ID.

Could you help me please?

Thanks a lot.

JB

sf::st_distance() is a good way to calculate distances. Since you specifically mention Euclidean distance, pay attention to the CRS of your spatial objects: Euclidean distance is reported for projected (planar) CRSes (those with eastings and northings, typically in meters), while geodesic distance is reported for geographic CRSes (those in degrees of longitude and latitude).
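To make the CRS point concrete, here is a minimal sketch that checks which kind of distance st_distance() will return. The two points and the choice of EPSG:32119 (NAD83 / North Carolina, in meters) as the planar CRS are my own assumptions for illustration, not something taken from your data.

```r
library(sf)

# Two illustrative points (Raleigh and Greensboro) in WGS84 lon/lat
pts_geo <- st_as_sf(
  data.frame(name = c("Raleigh", "Greensboro"),
             x = c(-78.633333, -79.819444),
             y = c(35.766667, 36.08)),
  coords = c("x", "y"), crs = 4326
)

# TRUE for a geographic CRS: st_distance() then reports geodesic distance
geo_is_longlat <- st_is_longlat(pts_geo)

# Reproject to a planar CRS (EPSG:32119 is an assumption made for this example)
pts_proj <- st_transform(pts_geo, 32119)

# FALSE for a projected CRS: st_distance() then reports Euclidean distance
proj_is_longlat <- st_is_longlat(pts_proj)

d_geo  <- st_distance(pts_geo)[1, 2]   # geodesic, in meters
d_proj <- st_distance(pts_proj)[1, 2]  # Euclidean in the projected plane, in meters
```

The two results should be close but not identical, because any projection distorts distances a little.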

A good practice is to give names to the rows and columns of your distance matrix; it makes the matrix easier to read (not that big a deal) and easier to work with via the subset operator. Which I personally find kind of neat...

Since your example is not easily reproducible, allow me to use my own, built on 3 semi-random North Carolina cities (because I am deeply in love with the NC shapefile that ships with the {sf} package).

library(sf)
library(dplyr)

# 3 semi-random cities in NC (because I *deeply love* the nc.shp file)
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
                     x = c(-78.633333, -79.819444, -77.912222),
                     y = c(35.766667, 36.08, 34.223333)) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326)

# prepare a distance matrix
mtx <- st_distance(cities)

# give the rows & cols meaningful names
colnames(mtx) <- cities$name
rownames(mtx) <- cities$name  

# see my work, and find it good...
mtx
# Units: [m]
#             Raleigh Greensboro Wilmington
# Raleigh         0.0   112342.9   183751.2
# Greensboro 112342.9        0.0   269595.9
# Wilmington 183751.2   269595.9        0.0

# use subset to find a specific distance pair by names
mtx["Greensboro", "Wilmington"]
# 269595.9 [m]

Hi,

Thanks a lot for your answer.

Your example is very clear and useful. However, in my case I need a distance matrix between two different spatial objects, and then, for each observation in the OBSERVATIONS data frame, only the distance where OBSERVATIONS.TRANSECT=TRANSECT.ID.

I achieved my goal with the following commands, but there must be an easier way to do it...

Thanks in advance if you have a better solution.

library(sf)
library(dplyr)
library(ggplot2)

#TRANSECTS IMPORT#

TRANSECTS <- st_read(dsn = 'C:/Users/USER/Documents/R/Distance_sampling',layer = '2018_12_05_IKAV_L93')
TRANSECTS<-st_set_crs(TRANSECTS,2154)
TRANSECTS<-arrange(TRANSECTS, id) 

#OBSERVATIONS IMPORT#

DATA1<-read.csv(file='C:/Users/USER/Documents/R/Distance_sampling/IKV.csv',sep = ",", dec = ".")
OBSERVATIONS<-st_as_sf(DATA1,coords =c("X_CHEV","Y_CHEV") )
OBSERVATIONS<-st_set_crs(OBSERVATIONS,2154)

#ST_DISTANCE CALCULATION#

D<-st_distance(OBSERVATIONS, TRANSECTS)
head(D)
# Units: [m]
#            [,1]       [,2]     [,3]     [,4]     [,5]     [,6]     [,7]      [,8]
# [1,]  231.25374 5262.85379 6109.051 9781.463 3920.827 9121.063 8586.745 11699.683
# [2,]  254.69284 5327.11137 6382.810 9921.521 4214.769 8636.904 8020.288 11520.814
# [3,]  226.47543 5361.76793 6442.659 9961.020 4281.754 8587.226 7956.365 11508.191
# [4,]   75.42924 5134.72507 6198.209 9729.977 4034.177 8478.314 7875.684 11335.406
# [5,] 1181.60916   25.19995 2306.999 4586.114 1899.290 4044.081 4150.167  5801.358
# [6,] 2376.53730   27.98066 2346.578 3480.811 2776.990 2691.604 3815.701  4219.291


DATA<-bind_cols(OBSERVATIONS, D)
names(DATA)[40:47]=c(1:8) #rename with my transects names

A<-DATA%>%
  filter(Transect==1)%>%
  select(-c(41:48))
names(A)[40]="DIST"

B<-DATA%>%
  filter(Transect==2)%>%
  select(-c(40,42:48))
names(B)[40]="DIST"

C<-DATA%>%
  filter(Transect==3)%>%
  select(-c(40:41,43:48))
names(C)[40]="DIST"

D<-DATA%>%
  filter(Transect==4)%>%
  select(-c(40:42,44:48))
names(D)[40]="DIST"

E<-DATA%>%
  filter(Transect==5)%>%
  select(-c(40:43,45:48))
names(E)[40]="DIST"

F<-DATA%>%
  filter(Transect==6)%>%
  select(-c(40:44,46:48))
names(F)[40]="DIST"

G<-DATA%>%
  filter(Transect==7)%>%
  select(-c(40:45,47:48))
names(G)[40]="DIST"

H<-DATA%>%
  filter(Transect==8)%>%
  select(-c(40:46))
names(H)[40]="DIST"

DATA2<-bind_rows(A,B,C,D,E,F,G,H)

select(DATA2,40)

Simple feature collection with 75 features and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 920310.1 ymin: 6614858 xmax: 935200.5 ymax: 6630276
Projected CRS: RGF93 / Lambert-93
First 10 features:
        DIST                 geometry
1  231.25374 POINT (920755.7 6626321)
2  254.69284 POINT (920422.1 6625473)
3  226.47543 POINT (920362.9 6625361)
4   75.42924 POINT (920606.8 6625418)
5   41.56977   POINT (924198 6622891)
6  125.53985 POINT (920795.7 6623698)
7  153.00790 POINT (921983.6 6624313)
8  126.24612 POINT (924282.2 6621614)
9   64.08143 POINT (920310.1 6626538)
10  66.53761 POINT (920563.8 6625667)
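One possibly shorter route to the same DIST column is to line up each observation with its own transect first and then call st_distance() with by_element = TRUE, which computes the distance for pair i only instead of the full matrix. The sketch below uses small made-up objects (toy coordinates, two transects, three observations) in place of the real TRANSECTS and OBSERVATIONS; the column names ID and Transect follow the thread and may need adjusting to your data.

```r
library(sf)

# Toy stand-in for TRANSECTS: two horizontal lines (made-up coordinates)
transects <- st_sf(
  ID = c(1, 2),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(100, 0))),
    st_linestring(rbind(c(0, 50), c(100, 50))),
    crs = 2154
  )
)

# Toy stand-in for OBSERVATIONS: each point records the transect it belongs to
observations <- st_sf(
  Transect = c(1, 2, 1),
  geometry = st_sfc(
    st_point(c(10, 5)),
    st_point(c(20, 40)),
    st_point(c(50, -3)),
    crs = 2154
  )
)

# For each observation, find the row of its own transect
idx <- match(observations$Transect, transects$ID)

# by_element = TRUE pairs row i of the first object with row i of the second
observations$DIST <- st_distance(
  observations,
  transects[idx, ],
  by_element = TRUE
)

observations$DIST
```

Because match() returns NA for an observation whose transect ID is missing from the transects table, it is worth checking anyNA(idx) before computing the distances.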
