Classification accuracy of the safs function is lower than that of a plain random forest model

msabam · January 26, 2020, 3:25pm

Hello. When I classify my data with the randomForest function and predict the test data, the accuracy and kappa are 0.96 and 0.95, respectively. For optimization, I chose the safs function with the rfSA method and 1000 iterations. When I predict the test data with that model, the accuracy and kappa are lower than those of the plain random forest. Can someone please explain why? Because of the optimization process, the performance metrics should show an improvement in classification accuracy. My data are U.S. crash records with the class variable "FATALS".

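library(randomForest)
library(caret)  # provides confusionMatrix(), safs(), safsControl() and rfSA
set.seed(1700)
# Baseline: random forest on all predictors
forest <- randomForest(as.factor(FATALS) ~ ., data = newtr, importance = TRUE)
predictionrf <- predict(forest, newte, type = "class")
trf <- table(predictionrf, newte$FATALS, dnn = c("Predicted", "Actual"))
rfcm <- confusionMatrix(trf)
rfcm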

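# Simulated annealing feature selection with random forest fits, 10-fold CV
sarfctrl <- safsControl(functions = rfSA, method = "cv", number = 10)
# The outcome must be a factor for classification; column 12 is assumed to be FATALS
sarf <- safs(x = newtr[, -12], y = as.factor(newtr[, 12]), iters = 1000, differences = TRUE, safsControl = sarfctrl)
sarfp <- predict(sarf, newte, type = "class")
tsarfp <- table(sarfp$pred, newte$FATALS, dnn = c("Predicted", "Actual"))
sarfcm <- confusionMatrix(tsarfp)
sarfcm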

Max · February 11, 2020, 4:59pm

I don't know what the training set size is, but it is pretty likely that Kappa values of 0.96 and 0.95 are within their respective noise levels. I'll bet they are not different.
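
One rough way to gauge that noise level is a binomial confidence interval on each test-set accuracy. The sketch below uses base R only and looks at accuracy rather than Kappa. The test-set size `n_test` and the second accuracy `acc_safs` are made-up values for illustration, since the thread reports neither.

```r
# Hypothetical values: the thread gives neither the test-set size
# nor the accuracy of the safs-selected model.
n_test   <- 500     # assumed number of test rows
acc_rf   <- 0.96    # accuracy reported for the plain random forest
acc_safs <- 0.95    # assumed accuracy for the safs-selected model

# Exact (Clopper-Pearson) 95% interval for the proportion of correct predictions.
acc_ci <- function(acc, n) {
  correct <- round(acc * n)
  as.numeric(binom.test(correct, n)$conf.int)
}

ci_rf   <- acc_ci(acc_rf, n_test)
ci_safs <- acc_ci(acc_safs, n_test)

# TRUE when the two intervals overlap, i.e. the gap could plausibly be noise.
intervals_overlap <- ci_rf[1] <= ci_safs[2] && ci_safs[1] <= ci_rf[2]

print(rbind(rf = ci_rf, safs = ci_safs))
print(intervals_overlap)
```

Overlapping intervals are only a crude check. Because both models are scored on the same test rows, a paired comparison of the two prediction vectors would be more sensitive.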

because of the optimization process, the performance metrics should show the improvement of classification accuracy.

We don't know anything about these data. The lack of improvement indicates that the data requires all of the predictors. If true, that is not a flaw.
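
To illustrate that last point, here is a small simulated sketch in base R. It is not the crash data and it uses logistic regression instead of random forest or safs, purely to keep it fast and self-contained. All names and coefficients are made up. Every predictor carries signal, so dropping any of them can only hurt test-set accuracy.

```r
set.seed(1)
n <- 2000

# Four simulated predictors; every one contributes equally to the outcome.
dat <- as.data.frame(matrix(rnorm(n * 4), ncol = 4))
names(dat) <- paste0("x", 1:4)
lin <- 1.5 * (dat$x1 + dat$x2 + dat$x3 + dat$x4)
dat$y <- rbinom(n, 1, plogis(lin))

train <- dat[1:1000, ]
test  <- dat[1001:2000, ]

# Fit a logistic regression on the training rows and
# return its accuracy on the held-out test rows.
accuracy <- function(formula) {
  fit  <- glm(formula, data = train, family = binomial)
  pred <- as.integer(predict(fit, test, type = "response") > 0.5)
  mean(pred == test$y)
}

acc_full   <- accuracy(y ~ x1 + x2 + x3 + x4)  # all predictors
acc_subset <- accuracy(y ~ x1 + x2)            # a reduced subset

print(c(full = acc_full, subset = acc_subset))
```

In a setting like this, a feature selection search has nothing to gain: the best it can do is return the full predictor set and match the unselected model.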
