I am currently looking at CART trees in relation to variable importance.
In the documentation for caret there is a function called varImp.
How it calculates variable importance differs depending on the model.
For rpart it also says: "This method does not currently provide class-specific measures of importance when the response is a factor."
When I fit an rpart model as below I am able to use varImp. Can anyone tell me how this is calculated? Is it based on the drop in Gini index when the variable is permuted or dropped?
Thanks for your time
library(rpart)
library(caret)
#> Warning: package 'caret' was built under R version 3.5.1
#> Loading required package: lattice
#> Loading required package: ggplot2
#> Warning: package 'ggplot2' was built under R version 3.5.1
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.5.1
#> Warning: package 'dplyr' was built under R version 3.5.1
# Get the Data
data(GermanCredit)
rf_mod <- rpart(Class~.,data = GermanCredit)
caret::varImp(rf_mod) %>%
  rownames_to_column() %>%
  arrange(desc(Overall)) %>%
  slice(1:10)
#> rowname Overall
#> 1 Amount 57.82419
#> 2 Duration 47.32593
#> 3 CheckingAccountStatus.none 43.66521
#> 4 CheckingAccountStatus.lt.0 37.87057
#> 5 CreditHistory.Critical 24.11095
#> 6 Purpose.NewCar 20.31030
#> 7 Purpose.UsedCar 18.56253
#> 8 CheckingAccountStatus.gt.200 17.39552
#> 9 OtherDebtorsGuarantors.Guarantor 11.40171
#> 10 Property.Unknown 11.25420
Recursive Partitioning: The reduction in the loss function (e.g. mean squared error) attributed to each variable at each split is tabulated and the sum is returned. Also, since there may be candidate variables that are important but are not used in a split, the top competing variables are also tabulated at each split. This can be turned off using the maxcompete argument in rpart.control. This method does not currently provide class-specific measures of importance when the response is a factor.
Classic CART trees do not use the random forest permutation methods for measuring importance (since there are no out-of-bag samples).
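To make the tabulation described above concrete, here is a minimal sketch that rebuilds a similar importance score by hand from the splits matrix of a fitted rpart object. It is not caret's own code. It assumes that varImp skips surrogate rows by default (for those rows, rpart's improve column holds an agreement measure rather than an improvement) and sums improve over primary and competing splits. The kyphosis data set and the names split_type, splits and importance are only illustrative choices.

```r
library(rpart)

# Small data set that ships with rpart (an illustrative choice, not from the thread).
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# fit$splits stores its rows node by node: the primary split first,
# then the competing splits, then the surrogate splits.
nodes <- fit$frame[fit$frame$var != "<leaf>", ]
split_type <- unlist(lapply(seq_len(nrow(nodes)), function(i) {
  c("primary",
    rep("competing", nodes$ncompete[i]),
    rep("surrogate", nodes$nsurrogate[i]))
}))

splits <- data.frame(
  var     = rownames(fit$splits),
  type    = split_type,
  improve = unname(fit$splits[, "improve"]),
  stringsAsFactors = FALSE
)

# Assumption: surrogates are dropped, primary and competing splits are kept.
keep <- splits[splits$type != "surrogate", ]

# Sum the improvement per variable and sort from most to least important.
importance <- aggregate(improve ~ var, data = keep, FUN = sum)
importance <- importance[order(-importance$improve), ]
importance
```

If the assumption holds, these sums should line up with the Overall column from caret::varImp(fit). Refitting with control = rpart.control(maxcompete = 0) leaves only the primary splits in the tally.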
Thank you for the quick reply. If I understand this correctly, varImp is using the same attributes picked by the tree that result in the most gain? Is this the improve metric from your two_split$splits call?
What is the improve metric for classification? I had a look at the code, but to be honest it's a bit over my head. From stepping through it, is the loss function the same one used when I build the tree, for example information loss?
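On the classification question, one way to check what improve measures is to recompute it for a single split. The sketch below assumes rpart's defaults for a factor response (Gini splitting, priors taken from the data, no loss matrix) and a data set without missing values. Under those assumptions, my understanding is that improve for a primary split is the number of observations in the node times the decrease in Gini impurity. The helper gini() and the kyphosis data are illustrative and not part of the original example.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Gini impurity of a factor: 1 minus the sum of squared class proportions.
gini <- function(y) {
  p <- prop.table(table(y))
  1 - sum(p^2)
}

# The first row of fit$splits is the primary split at the root node.
split_var <- rownames(fit$splits)[1]
cut_point <- fit$splits[1, "index"]

y    <- kyphosis$Kyphosis
left <- kyphosis[[split_var]] < cut_point
n    <- length(y)

# n times (parent impurity minus the size-weighted impurity of the children).
manual_improve <- n * (gini(y) -
                         mean(left)  * gini(y[left]) -
                         mean(!left) * gini(y[!left]))

c(manual = manual_improve, rpart = unname(fit$splits[1, "improve"]))
```

If the two numbers agree, the loss being tallied is the same impurity measure the tree was grown with. Fitting with parms = list(split = "information") switches the criterion to an entropy-based one, in which case the Gini calculation above would no longer be expected to match.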