Hi everyone! This question touches on {caret}, {nnet}, multinomial logistic regression, and how to interpret the output of functions from those packages.
I am trying to calculate and interpret the variable importance of a multinomial logistic regression I built using the multinom() function from the {nnet} R package. I want to measure how much each predictor variable contributes to the outcome variable, and the {caret} documentation says that its varImp() function can do that. On the surface the code works, in that it returns some importance values, but as far as I can tell neither the documentation nor the function itself explains how these values are calculated or what they actually represent.
Here's my attempt at a reprex:
library(tidyverse)
library(nnet)
library(caret)
fit <- multinom(Species ~ ., data = iris) # fit model
varImp(fit)
My question is: what do these numbers mean, or how can I find out what they mean? (I've tried the package documentation.) Is there an alternative way to get an estimate of the relative variable importance?
Thank you!
(Sorry - I've posted this question once on Stack Overflow but didn't get an answer...)
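You say that you posted the same question on Stack Overflow. Can you link it here? See the FAQ entry on cross-posting.
Now to your question. The varImp() value here is, for each variable, the sum of the absolute values of its coefficients across the outcome classes (the rows of the coefficient table printed below).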
# library(tidyverse)
library(nnet)
library(caret)
#> Loading required package: lattice
#> Loading required package: ggplot2
fit <- multinom(Species ~ ., data = iris) # fit model
#> # weights: 18 (10 variable)
#> initial value 164.791843
#> iter 10 value 16.177348
#> iter 20 value 7.111438
#> iter 30 value 6.182999
#> iter 40 value 5.984028
#> iter 50 value 5.961278
#> iter 60 value 5.954900
#> iter 70 value 5.951851
#> iter 80 value 5.950343
#> iter 90 value 5.949904
#> iter 100 value 5.949867
#> final value 5.949867
#> stopped after 100 iterations
fit
#> Call:
#> multinom(formula = Species ~ ., data = iris)
#>
#> Coefficients:
#>            (Intercept) Sepal.Length Sepal.Width Petal.Length Petal.Width
#> versicolor    18.69037    -5.458424   -8.707401     14.24477   -3.097684
#> virginica    -23.83628    -7.923634  -15.370769     23.65978   15.135301
#>
#> Residual Deviance: 11.89973
#> AIC: 31.89973
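To make the "sum of absolute coefficients" concrete, here is a minimal sketch that redoes the arithmetic by hand. It assumes the coefficient values printed above, typed in as a plain matrix so it runs without {nnet} or {caret}. The object names `coefs` and `importance` are mine, purely for illustration.

```r
# Coefficient matrix copied from the printed multinom() fit above
# (rows = non-reference classes, columns = model terms).
coefs <- matrix(
  c( 18.69037, -5.458424,  -8.707401, 14.24477, -3.097684,
    -23.83628, -7.923634, -15.370769, 23.65978, 15.135301),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("versicolor", "virginica"),
    c("(Intercept)", "Sepal.Length", "Sepal.Width",
      "Petal.Length", "Petal.Width")
  )
)

# Sum the absolute coefficients of each term over the classes,
# then drop the intercept, which is not a predictor.
importance <- colSums(abs(coefs))
importance <- importance[names(importance) != "(Intercept)"]

round(importance, 3)
# roughly: Sepal.Length 13.382, Sepal.Width 24.078,
#          Petal.Length 37.905, Petal.Width 18.233
```

On the fitted object itself, the equivalent should be `colSums(abs(coef(fit)))[-1]`, which I would expect to match the Overall column returned by `varImp(fit)`. It is worth checking that against your installed version of caret.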
The method used in caret (and vip, IIRC) is based on a paper by Gevrey et al. (2003) for neural networks that uses weighted averages of the model coefficients.
Thanks @Rodrigue ! Good point noted on the cross-posting, I've now added the link to the original Stack Overflow post; I did let 23 days elapse before coming here - had no intention of spamming at all! Will watch out for this in the future.
Thanks for answering my question, that's really helpful!!