Hi, I'm running a regression on the insurance data. I'm fitting a random forest and I get the following warning:
"Warning message:
In mean.default((ytest - pred.bagging)^2) :
argument is not numeric or logical: returning NA"
My code is:
library(randomForest)
set.seed(888888)
xtrain<-M[1:1000,]
ytest<-M[1001:1338,]
fit.bagging<-randomForest(charges~.,data=xtrain,importance=TRUE)
fit.bagging
I haven't used the randomForest package so I am working from general experience.
Shouldn't this line
pred.bagging<-predict(fit.bagging,data=(-xtrain))
be
pred.bagging <- predict(fit.bagging,data=(ytest))
That is, do the prediction with the data that you held back. Also, I would expect the parameter for passing the data to predict() to be called newdata, not data, but I could easily be wrong about that.
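For what it's worth, here is a minimal sketch of predicting on the held-back rows. It assumes the randomForest package is installed. The data frame M below is a simulated stand-in (made-up columns age, bmi, smoker, and charges), not the real insurance data, and the names train and test are my own. As far as I know, the predict() method for randomForest objects takes the new rows through newdata. An argument named data is not one it recognises, so I believe it is silently ignored and you get the out-of-bag predictions for the training rows back instead.

```r
library(randomForest)

set.seed(1)

# Hypothetical stand-in for the insurance data M (simulated, not the real data)
n <- 300
M <- data.frame(
  age    = sample(18:64, n, replace = TRUE),
  bmi    = rnorm(n, mean = 30, sd = 6),
  smoker = factor(sample(c("no", "yes"), n, replace = TRUE))
)
M$charges <- 250 * M$age + 300 * M$bmi +
  20000 * (M$smoker == "yes") + rnorm(n, sd = 2000)

# Split into a training set and a held-back test set
train <- M[1:200, ]
test  <- M[201:300, ]

# Numeric response, so this fits a regression forest
fit.bagging <- randomForest(charges ~ ., data = train, importance = TRUE)

# Pass the held-back rows through newdata
pred.bagging <- predict(fit.bagging, newdata = test)

# There should be one prediction per held-back row
length(pred.bagging) == nrow(test)
```

If the length check comes back FALSE in your own code, that is a hint that predict() never saw the held-back rows.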
In the line
MSE.bagging <- mean((ytest-pred.bagging)^2)
you are subtracting the pred.bagging values from an entire data.frame, ytest. Shouldn't you be referring to the observed values, the charges column, from ytest?
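A small base R sketch of the difference, using made-up numbers rather than the insurance data (the names test and pred are hypothetical): the arithmetic on the whole data frame gives back a data frame, and mean() of a data frame returns NA with the warning you saw, while the charges column alone gives a number.

```r
# Hypothetical held-out rows and predictions (made-up numbers)
test <- data.frame(age = c(25, 40, 58), charges = c(3000, 7000, 12000))
pred <- c(3500, 6000, 12500)

# (test - pred)^2 is still a data frame, so mean() returns NA with
# "argument is not numeric or logical"; the warning is suppressed here
bad <- suppressWarnings(mean((test - pred)^2))
is.na(bad)

# Use only the observed response column
MSE <- mean((test$charges - pred)^2)
MSE
```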
Would you like to be my friend? If yes, please let me know. I believe I owe you a lot. You saved me again.
Thank you for your time, help, and kindness, FJCC :* love you
Sorry to bother you again, but do you think the code would be the same for classification too?
I am not sure of the changes needed for classification. I would inspect the output of predict() to make sure you get the predicted class. I have a vague memory of having to change an argument in predict() to get that, but I was not using randomForest at the time. Also, MSE is not an appropriate measure for a classification problem.
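A rough sketch of what the classification version might look like, assuming the randomForest package is installed. It uses the iris data that ships with R as a stand-in, since the insurance data is not shown here, and the names fit.class, pred.class, conf.mat, and err.rate are my own. To my knowledge, randomForest fits a classification forest when the response is a factor, and predict() with type = "response" returns the predicted classes (type = "prob" returns class probabilities instead). The misclassification rate takes the place of MSE.

```r
library(randomForest)

set.seed(1)

# iris ships with R; used here only as a stand-in classification data set
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

# Species is a factor, so this fits a classification forest
fit.class <- randomForest(Species ~ ., data = train)

# type = "response" asks for predicted classes rather than probabilities
pred.class <- predict(fit.class, newdata = test, type = "response")

# Confusion matrix and misclassification rate instead of MSE
conf.mat <- table(observed = test$Species, predicted = pred.class)
err.rate <- mean(pred.class != test$Species)

conf.mat
err.rate
```

If your response is stored as numbers or character strings, converting it with factor() before fitting should be what makes randomForest treat the problem as classification.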