Crossposted from Stack Overflow (tag: r): Why predicted values differ in knn regression when using caret vs FNN
I was doing some manual calculations for knn regression and came across an unexpected discrepancy. The predicted values I computed by hand do not match the ones I got from the 'knnreg' function in the 'caret' package. As a second check I used another package (FNN) and found that my manual calculations do agree with the FNN predictions, so now I'm really confused. Here is example code:
# caret vs. FNN packages
# issue in predictions
library(caret)
library(FNN)
set.seed(1) # fix the seed so the example is reproducible (any value works)
n <- 100
x <- rnorm(n)
y <- 2 + 3*x + rnorm(n, sd = 0.5)
x <- as.matrix(x)
# using caret
knn_caret <- knnreg(x, y, k = 5)
yhat_caret <- predict(knn_caret, newdata = x)
# using FNN
knn_FNN <- knn.reg(train = x, y = y, k = 5)
yhat_FNN <- knn_FNN$pred
# manual calculation using the neighbors.
# choose a point
i <- 3
nn <- dbscan::kNN(x, k = 5) # nearest neighbors of every point; kNN() with an $id matrix comes from the dbscan package, not caret
neighbors <- nn$id[i, ]
mean(y[neighbors]) # manual calculation
yhat_FNN[i] # FNN package
yhat_caret[i] # caret package
If you can point out any mistake in my code or share any thoughts on this issue, I would greatly appreciate it.
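One check that may help narrow this down is whether a point is allowed to count as its own nearest neighbor. The base-R sketch below computes the k-nearest-neighbor average for one point both ways. The helper `knn_mean` is hypothetical and not part of caret, FNN, or dbscan, and the data are regenerated with an arbitrary seed, so the numbers are only comparable to the package output if the same `x` and `y` are used. It rests on an assumption worth verifying against the package documentation: that `knn.reg()` without a `test` argument does leave-one-out prediction, while `predict()` on the training data in caret treats each point as one of its own neighbors.

```r
# Base-R sketch: k-NN average for one point, with or without the point itself.
# knn_mean() is a hypothetical helper, not a function from any package.
knn_mean <- function(x, y, i, k, include_self) {
  d <- abs(x - x[i])               # distances from point i (1-D predictor)
  if (!include_self) d[i] <- Inf   # leave-one-out: never select point i itself
  idx <- order(d)[seq_len(k)]      # indices of the k smallest distances
  list(idx = idx, pred = mean(y[idx]))
}

# Toy data with the same structure as in the question (seed is arbitrary)
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)
i <- 3

loo  <- knn_mean(x, y, i, k = 5, include_self = FALSE)
self <- knn_mean(x, y, i, k = 5, include_self = TRUE)

loo$pred   # candidate match for yhat_FNN[i] and the manual calculation
self$pred  # candidate match for yhat_caret[i]
```

If the two values differ in the same way the package predictions do, the discrepancy would come from how the training point itself is handled rather than from an error in the manual calculation.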