str function mentions stats but how do I create the stat output using caret?

I have the following code:

model <- as.formula("y~z+x2+x4+x5+x6+x10+x11")
set.seed(12345)
suppressMessages(library(caret))
train.control <- trainControl(method="repeatedcv", number=10, repeats=3)
cv <- train(model,data=df2, method="glm", trControl=train.control)
cat("RMSE =",cv$results$RMSE,"\n")
cat("Rsquared =",cv$results$Rsquared,"\n")
cat("MAE =",cv$results$MAE,"\n")

Each of the following statements returns NULL:

cv$varImp
cv$aic
cv$deviance

When I see something listed in str(cv), am I to assume it is available?
varImp, aic and deviance all appear in that listing.
The package documentation is sparse on this topic.

Hi, you can extract the AIC and deviance like this. I have used the iris dataset to make it reproducible.

model <- as.formula("Sepal.Length ~ Sepal.Width + Petal.Length")
set.seed(12345)
suppressMessages(library(caret))
train.control <- trainControl(method="repeatedcv", number=10, repeats=3)
cv <- train(model, data=iris, method="glm", trControl=train.control)

# view summary
summary(cv)

# extract these
cv$finalModel$aic
cv$finalModel$deviance
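
On the more general question about reading str(cv): the indentation in the str() output shows nesting, so aic and deviance sit under finalModel rather than at the top level, which would explain why cv$aic returns NULL. Here is a minimal sketch for exploring that, refitting the same iris model (max.level = 1 is just a convenient choice, not something from the original post):

```r
suppressMessages(library(caret))

set.seed(12345)
train.control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
cv <- train(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris,
            method = "glm", trControl = train.control)

# Show only the top-level elements; nested lists are not expanded
str(cv, max.level = 1)

# Check where a name actually lives before extracting it
in_train_object <- "aic" %in% names(cv)             # top level of the train object
in_final_model  <- "aic" %in% names(cv$finalModel)  # the underlying glm fit
print(c(train = in_train_object, finalModel = in_final_model))

# `$` returns NULL silently for a name that is not at that level
print(is.null(cv$aic))
print(cv$finalModel$aic)
```

So a name showing up somewhere in str(cv) only tells you it exists at some depth; you still need the full path to reach it.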

William,

Do you know why cv$ModelInfo$varImp does not work? Thanks for the aic and deviance example; I tried to expand on it.

Mary

str(cv)
List of 24
 $ method      : chr "glm"
 $ modelInfo   :List of 15
  ..$ label     : chr "Generalized Linear Model"
  ..$ library   : NULL
  ..$ loop      : NULL
  ..$ type      : chr [1:2] "Regression" "Classification"
  ..$ parameters:'data.frame':	1 obs. of  3 variables:
  .. ..$ parameter: chr "parameter"
  .. ..$ class    : chr "character"
  .. ..$ label    : chr "parameter"
  ..$ grid      :function (x, y, len = NULL, search = "grid")  
  ..$ fit       :function (x, y, wts, param, lev, last, classProbs, ...)  
  ..$ predict   :function (modelFit, newdata, submodels = NULL)  
  ..$ prob      :function (modelFit, newdata, submodels = NULL)  
  ..$ varImp    :function (object, ...)  

cv$ModelInfo$varImp

Output
NULL

I think it is because varImp() is a function.

You can run varImp(cv) though.

Though maybe it should be this: cv$modelInfo$varImp(cv)
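
Another possible reason for the NULL, going only by the str() output posted above: the element is named modelInfo with a lower-case m, and `$` is case-sensitive, so cv$ModelInfo is itself NULL and anything pulled out of it is NULL as well. With the matching case, cv$modelInfo$varImp gives back the stored function, which then has to be called to get values. A small sketch on the iris fit to check this:

```r
suppressMessages(library(caret))

set.seed(12345)
train.control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
cv <- train(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris,
            method = "glm", trControl = train.control)

# Wrong capitalisation: there is no element called "ModelInfo"
print(is.null(cv$ModelInfo))

# Correct capitalisation: the element exists and holds a function
print(is.function(cv$modelInfo$varImp))

# Calling the stored function produces the importance values
raw_importance <- cv$modelInfo$varImp(cv)
print(raw_importance)
```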

William,

Note the definition of variable importance.
variable importance - mean decrease in node impurity (and not the mean decrease in accuracy)

To me this would imply that cv$modelInfo$varImp(cv) is the correct syntax. Getting the right stats out of a str function is tricky.

This is what I tried. Note the varImp values differ.

cv$modelInfo$varImp(cv)  # 8-> 39.7
varImp(cv)               # 100 -> 0
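
The two results may differ only in scaling. As far as I know, varImp() on a train object rescales the importances to a 0 to 100 range by default (scale = TRUE), while the function stored in modelInfo returns the raw values, which for a glm fit are the absolute t-statistics of the coefficients. The node-impurity definition comes from randomForest and, as far as I can tell, does not apply to a glm. A hedged sketch on the iris data to compare the two:

```r
suppressMessages(library(caret))

set.seed(12345)
train.control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
cv <- train(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris,
            method = "glm", trControl = train.control)

raw      <- cv$modelInfo$varImp(cv)             # raw values from the stored function
scaled   <- varImp(cv)$importance               # default: rescaled to 0-100
unscaled <- varImp(cv, scale = FALSE)$importance

# With scaling switched off, the two approaches should agree
print(all.equal(raw$Overall, unscaled$Overall))

# For a gaussian glm the raw values are the absolute t-statistics
t_abs <- abs(summary(cv$finalModel)$coefficients[-1, "t value"])
print(cbind(raw = raw$Overall, t_abs = unname(t_abs), scaled = scaled$Overall))
```

If that holds for your model too, both calls rank the variables identically and neither is more "correct"; varImp(cv, scale = FALSE) simply reports them on the original scale.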

I also looked at:

library(caret)
library(randomForest)
varImpPlot(cv,type=2) # Note: This function only works for objects of class `randomForest'

From this I would conclude that cv$modelInfo$varImp(cv) is correct. Thank you for replying.
