I am working with the R programming language.
I generated the following random data set that contains x and y points:
set.seed(123)
x_cor = rnorm(10,100,100)
y_cor = rnorm(10,100,100)
my_data = data.frame(x_cor,y_cor)
       x_cor     y_cor
1   43.95244 222.40818
2   76.98225 135.98138
3  255.87083 140.07715
4  107.05084 111.06827
5  112.92877  44.41589
6  271.50650 278.69131
7  146.09162 149.78505
8  -26.50612 -96.66172
9   31.31471 170.13559
10  55.43380  52.72086
I am trying to write a "greedy search" algorithm that finds the point located the shortest distance from a given starting point.
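For example, suppose we start at the point (-26.50612, -96.66172), which is row 8 of the data. I first defined a function for the Euclidean distance between two points: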
# Euclidean distance between (x1, y1) and (x2, y2); note the argument order
distance <- function(x1, x2, y1, y2) {
  dist <- sqrt((x1 - x2)^2 + (y1 - y2)^2)
  return(dist)
}
Then I calculated the distance between (-26.50612, -96.66172) and each point:
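results <- list()
for (i in seq_len(nrow(my_data))) {
  # distance() takes (x1, x2, y1, y2): start x, point x, start y, point y
  distance_i <- distance(-26.50612, my_data[i, 1], -96.66172, my_data[i, 2])
  index <- i
  my_data_i <- data.frame(distance_i, index)
  results[[i]] <- my_data_i
}
# Stack the one-row data frames into a single data frame
results_df <- do.call(rbind, results)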
However, I don't think this is working, because the distance between the starting point (-26.50612, -96.66172) and itself is not 0 (see the 8th row):
   distance_i index
1    264.6443     1
2    238.7042     2
3    191.3048     3
4    185.0577     4
5    151.7506     5
6    306.4785     6
7    331.2483     7
8    223.3056     8
9    213.3817     9
10   331.6455    10
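A quick sanity check, sketched under the assumption that `my_data` is regenerated exactly as above in a fresh session: the same distances can be computed in vectorized form, taking the starting coordinates directly from row 8 instead of typing rounded literals. The names `start`, `dx`, `dy`, and `check_df` are only illustrative. With this setup the distance for row 8 should be exactly 0, so a nonzero value there would suggest the loop ran against a different `my_data` or a different definition of `distance`.

```r
# Regenerate the data exactly as in the question (fresh session assumed)
set.seed(123)
x_cor <- rnorm(10, 100, 100)
y_cor <- rnorm(10, 100, 100)
my_data <- data.frame(x_cor, y_cor)

# Use the stored coordinates of row 8 rather than rounded literals
start <- 8
dx <- my_data$x_cor - my_data$x_cor[start]
dy <- my_data$y_cor - my_data$y_cor[start]

# Euclidean distance from the starting point to every row, vectorized
check_df <- data.frame(distance_i = sqrt(dx^2 + dy^2),
                       index = seq_len(nrow(my_data)))
print(check_df)

# The starting point should be at distance 0 from itself
print(check_df$distance_i[start] == 0)
```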
My Question:
- Can someone please show me how to write a function that correctly finds the nearest point to an initial point?
- (Step 1) Then removes the nearest point and the initial point from "my_data"
- (Step 2) And then re-calculates the nearest point in "my_data" (i.e. with those points removed), starting from the nearest point identified in Step 1
- And in the end, shows the path that was taken (e.g. 5, 7, 1, 9, 3, etc.)
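The steps above can be sketched roughly as follows. This is only one possible reading of them, not a definitive implementation: the helper name `greedy_path` and its arguments `pts` and `start` are hypothetical, the starting point is given as a row index, and ties are resolved by `which.min`, which returns the first minimum. Instead of deleting rows from `my_data`, the sketch keeps a vector of unvisited row indices, so the reported path refers to the original row numbers.

```r
# Hypothetical helper: greedy nearest-neighbour path through the rows of `pts`.
# `pts` is a data frame with columns x_cor and y_cor; `start` is a row index.
greedy_path <- function(pts, start) {
  remaining <- setdiff(seq_len(nrow(pts)), start)  # rows not yet visited
  path <- start
  current <- start
  while (length(remaining) > 0) {
    # Distance from the current point to every unvisited point
    d <- sqrt((pts$x_cor[remaining] - pts$x_cor[current])^2 +
              (pts$y_cor[remaining] - pts$y_cor[current])^2)
    nearest <- remaining[which.min(d)]        # closest unvisited row
    path <- c(path, nearest)                  # record the step
    remaining <- setdiff(remaining, nearest)  # "remove" it from the pool
    current <- nearest                        # continue from the new point
  }
  path
}

# Same data as in the question
set.seed(123)
my_data <- data.frame(x_cor = rnorm(10, 100, 100),
                      y_cor = rnorm(10, 100, 100))

path <- greedy_path(my_data, start = 8)
print(path)
```

The returned vector starts at the chosen row and lists each row exactly once, in the order visited.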
Can someone please show me how to do this?
Thanks!