@Max, I am stubborn, and therefore I stuck with `knnreg()` through thick and thin. I used the RANN package's `nn2()` function to find the 10 closest neighbors for each missing data point, and kept only those (other data points would be too far removed to be considered). This reduced the amount of existing data from 1.9M to ~200K records and made it possible to run `knnreg`.
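For anyone wanting to try the same trick, here is a rough sketch of the neighbor pre-filtering step. The data frames `known` and `missing`, the columns `a`, `b`, `x`, and the simulated values are all made up for illustration; only `nn2()` from RANN and `knnreg()` / `predict()` from caret are real calls, and this is not my exact code.

```r
# Sketch only: `known`, `missing` and their columns are hypothetical stand-ins.
library(RANN)   # nn2()
library(caret)  # knnreg()

set.seed(1)
known   <- data.frame(a = runif(5000), b = runif(5000))
known$x <- known$a + 2 * known$b + rnorm(5000, sd = 0.05)
missing <- data.frame(a = runif(50), b = runif(50))  # rows whose x is unknown

known_xy   <- as.matrix(known[, c("a", "b")])
missing_xy <- as.matrix(missing[, c("a", "b")])

# For each missing point, look up its 10 nearest neighbors among the known rows.
# nn$nn.idx is a 50 x 10 matrix of row indices into `known`.
nn <- nn2(data = known_xy, query = missing_xy, k = 10)

# Keep only the known rows that are a neighbor of at least one missing point.
keep   <- unique(as.vector(nn$nn.idx))
nearby <- known[keep, ]

# Fit on the reduced training set and fill the holes.
fit       <- knnreg(x = as.matrix(nearby[, c("a", "b")]), y = nearby$x, k = 10)
missing$x <- predict(fit, newdata = missing_xy)
```

Since every missing point's 10 nearest neighbors survive the filter, a fit with `k` of 10 or less should see the same neighbors it would have found in the full data (ties aside), just with far fewer rows to search.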
I may be breaking many rules here, as I'm not an ML expert, but my primary goal is a one-time hole plugging in my data. If I could do it by hand - I would.
On the subject of multiple outcomes, I couldn't get the `x + y ~ a + b` or `cbind(x,y) ~ a + b` syntax to work: it works at the `knnreg` call, but breaks on `predict`:
Error in knnregTrain(train = c(25, 37, 38, 26, 27, 43, 32, 32, 31, 46, : 'train' and 'class' have different lengths
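One possible workaround, assuming it is acceptable to treat each outcome independently, is to fit a separate `knnreg` model per outcome and bind the predictions together. The sketch below uses invented data (`train`, `new_pts`) and the matrix interface rather than a formula; I have not confirmed this is the intended way to handle several outcomes in caret.

```r
# Sketch only: `train`, `new_pts` and the outcome names are hypothetical.
library(caret)  # knnreg()

set.seed(2)
train   <- data.frame(a = runif(200), b = runif(200))
train$x <- train$a + train$b
train$y <- train$a - train$b
new_pts <- data.frame(a = runif(5), b = runif(5))

predictors <- c("a", "b")
outcomes   <- c("x", "y")

# One single-outcome model per column; each predict() returns a numeric vector,
# so sapply() collects them into a 5 x 2 matrix with columns "x" and "y".
preds <- sapply(outcomes, function(out) {
  fit <- knnreg(x = as.matrix(train[, predictors]), y = train[[out]], k = 5)
  predict(fit, newdata = as.matrix(new_pts[, predictors]))
})
```

With the same predictors and the same `k`, both models use the same neighbors, so this should give the same result a multi-outcome k-NN average would, at the cost of repeating the neighbor search once per outcome.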