Finding the Shortest Distance in R

I am working with the R programming language.

I generated the following random data set that contains x and y points:

set.seed(123)

x_cor = rnorm(10,100,100)
y_cor = rnorm(10,100,100)

my_data = data.frame(x_cor,y_cor)

       x_cor     y_cor
1   43.95244 222.40818
2   76.98225 135.98138
3  255.87083 140.07715
4  107.05084 111.06827
5  112.92877  44.41589
6  271.50650 278.69131
7  146.09162 149.78505
8  -26.50612 -96.66172
9   31.31471 170.13559
10  55.43380  52.72086

I am trying to write a "greedy search" algorithm that finds the point located the shortest distance from a given starting point.

For example, suppose we start at the 8th point (-26.50612, -96.66172). I defined the following distance function:

distance <- function(x1,x2, y1,y2) {
  dist <- sqrt((x1-x2)^2 + (y1 - y2)^2)
  return(dist)
}

Then I calculated the distance between -26.50612, -96.66172 and each point:

results <- list()

for (i in 1:10){

  distance_i <- distance(-26.50612, my_data[i,1], -96.66172, my_data[i,2])
  index = i

  my_data_i = data.frame(distance_i, index)

  results[[i]] <- my_data_i

}

results_df <- data.frame(do.call(rbind.data.frame, results))

However, I don't think this is working because the distance between the starting point -26.50612, -96.66172 and itself is not 0 (see 8th row):

  distance_i index
1    264.6443     1
2    238.7042     2
3    191.3048     3
4    185.0577     4
5    151.7506     5
6    306.4785     6
7    331.2483     7
8    223.3056     8
9    213.3817     9
10   331.6455    10

My Question:

  • Can someone please show me how to write a function that correctly finds the nearest point to an initial point?
  • (Step 1) Then removes the nearest point and the initial point from "my_data"
  • (Step 2) And then re-calculates the nearest point in the reduced "my_data", starting from the nearest point identified in Step 1
  • And in the end, shows the path that was taken (e.g. 5, 7, 1, 9, 3, etc.)

Can someone please show me how to do this?

Thanks!

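I do not get your results when I run your code. The distance to the 8th point is very nearly zero (about 5.5e-06). It is not exactly zero because the starting coordinates typed into the call are the rounded values that R prints, not the full-precision values stored in my_data.
I also included an equivalent calculation with no for loop.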

set.seed(123)
 
x_cor = rnorm(10,100,100)
y_cor = rnorm(10,100,100)

my_data = data.frame(x_cor,y_cor)
my_data
       x_cor     y_cor
1   43.95244 222.40818
2   76.98225 135.98138
3  255.87083 140.07715
4  107.05084 111.06827
5  112.92877  44.41589
6  271.50650 278.69131
7  146.09162 149.78505
8  -26.50612 -96.66172
9   31.31471 170.13559
10  55.43380  52.72086
distance <- function(x1,x2, y1,y2) {
  dist <- sqrt((x1-x2)^2 + (y1 - y2)^2)
  return(dist)
}

results <- list()
for (i in 1:10){

  distance_i <- distance(-26.50612, my_data[i,1], -96.66172, my_data[i,2])
  index = i

  my_data_i = data.frame(distance_i, index)

  results[[i]] <- my_data_i
}
 
results_df <- data.frame(do.call(rbind.data.frame, results))
results_df
     distance_i index
1  3.267568e+02     1
2  2.546226e+02     2
3  3.684861e+02     3
4  2.469599e+02     4
5  1.983557e+02     5
6  4.792718e+02     6
7  3.008754e+02     7
8  5.548514e-06     8
9  2.729909e+02     9
10 1.703799e+02    10
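
If an exact zero for the starting point matters, one option is to take the start from the data frame itself rather than retyping the printed coordinates. The following is only a sketch under that assumption (the start is a row of my_data). The names start, d, d_self and nearest are mine, not from the original post.

```r
set.seed(123)
my_data <- data.frame(x_cor = rnorm(10, 100, 100),
                      y_cor = rnorm(10, 100, 100))

start <- 8  # assumption: the starting point is row 8 of my_data

# Distances from the start row to every row, using the stored (unrounded) values
d <- sqrt((my_data$x_cor - my_data$x_cor[start])^2 +
          (my_data$y_cor - my_data$y_cor[start])^2)

d_self <- d[start]   # distance from the point to itself
d_self

# Exclude the start itself, then pick the closest remaining row
d[start] <- Inf
nearest <- which.min(d)
nearest
```

With the table above, the smallest non-zero distance belongs to row 10, so that is the row which.min() should pick here.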
 
# No for loop: distance() is vectorized, so whole columns can be passed at once
results_vec <- distance(-26.50612,my_data$x_cor,-96.66172,my_data$y_cor)
results_df2 <- data.frame(distance_i = results_vec,
                           index = 1:10)
results_df2
     distance_i index
1  3.267568e+02     1
2  2.546226e+02     2
3  3.684861e+02     3
4  2.469599e+02     4
5  1.983557e+02     5
6  4.792718e+02     6
7  3.008754e+02     7
8  5.548514e-06     8
9  2.729909e+02     9
10 1.703799e+02    10
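
For the greedy path asked about in the question, here is one possible sketch, not a definitive implementation. It assumes the start is given as a row index and that the columns are named x_cor and y_cor as in my_data. Instead of physically deleting rows, it marks visited rows and sets their distance to Inf, which keeps the original row numbers for the reported path. The function name greedy_path and its arguments are hypothetical.

```r
set.seed(123)
my_data <- data.frame(x_cor = rnorm(10, 100, 100),
                      y_cor = rnorm(10, 100, 100))

# Greedy nearest-neighbour walk: from the current point, always move to the
# closest point that has not been visited yet.
greedy_path <- function(data, start) {
  n <- nrow(data)
  visited <- logical(n)   # TRUE once a row is on the path
  path <- numeric(n)      # row indices in visiting order
  current <- start

  for (step in seq_len(n)) {
    path[step] <- current
    visited[current] <- TRUE
    if (all(visited)) break

    # Distances from the current point to all rows
    d <- sqrt((data$x_cor - data$x_cor[current])^2 +
              (data$y_cor - data$y_cor[current])^2)
    d[visited] <- Inf     # visited rows can no longer be chosen
    current <- which.min(d)
  }
  path
}

path <- greedy_path(my_data, start = 8)
path
```

The result is a vector of row numbers beginning with the start (8), followed by its nearest neighbour (10), and so on until every point has been visited. Note that a greedy walk like this gives a quick path, but not necessarily the shortest overall route.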