Multinomial Model plot in R

jobu · August 16, 2020, 2:59pm

Hey there,

I'm trying to plot a multinomial regression, but something does not work the way I want...

The formula is: mod.fak_diff_1996.7 <- multinom(factor(fac.num)~ factor(revenu), data=voxit_destill_1996)

fac.num is a nominal variable with 3 values; revenu is an ordinal scale that measures income. I would like to visualise the probability of belonging to fac.num(1) for each income value 1, 2, 3...
I tried with this plot, but there is something wrong or missing...

pred.probs <- predict(mod.fak_diff_1996.7, type = "probs")

plot.probs.1996 <- ggplot(pred.probs, aes(x = revenu, color = level)) + eom_point(aes(y=mean)) + geom_errorbar(aes(ymin = lower, ymax = upper)) + theme_minimal() + ylab("Vorausgesagte Wahrscheinlichkeit") + xlab("Einkommen") + scale_y_continuous(labels = percent)

I would be very grateful if someone could help me.
Thanks,
Johanna

AlexisW · August 16, 2020, 6:48pm

What is the error message? Can you provide some test data? Is the error due to the typo in eom_point()?

jobu · August 16, 2020, 6:53pm

Hey!

Unfortunately, the typo is just in my message here...
The error message is: data must be a data frame, or other object coercible by fortify(), not a numeric vector. Do you have any idea what I did wrong here?

Best,
Johanna

AlexisW · August 16, 2020, 7:11pm

Well that suggests that predict only returned a vector (or matrix) of probabilities, and not a data frame. You can visualize the content of pred.probs to check that, and see if you need some additional computation to obtain the variables mean, lower, upper and level that you want to plot. You have to format them as a data.frame and then give them to ggplot().
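To make the "format them as a data.frame" step concrete, here is a minimal sketch on made-up data. The object names toy, probs and long are placeholders of mine, and the toy values are random, so only the shape of the result matters. It stacks the probability matrix into one row per case and outcome level, which is the kind of layout ggplot() can work with. It does not produce the mean, lower and upper columns; those would need extra computation.

```r
library(nnet)

# Toy data: the variable names mirror the thread, the values are made up.
set.seed(1)
toy <- data.frame(
  fac.num = factor(sample(c("A", "B", "C"), 60, replace = TRUE)),
  revenu  = factor(sample(1:3, 60, replace = TRUE))
)

mod <- multinom(fac.num ~ revenu, data = toy, trace = FALSE)

# predict() returns a plain matrix: one row per case, one column per outcome level.
probs <- predict(mod, type = "probs")

# Stack the columns into "long" format. as.vector() reads the matrix column by
# column, so the level label is repeated once per case and the predictor is
# recycled once per column.
long <- data.frame(
  revenu = rep(toy$revenu, times = ncol(probs)),
  level  = rep(colnames(probs), each = nrow(probs)),
  prob   = as.vector(probs)
)

head(long)
```

A data frame like long can then be passed to ggplot() with aes(x = revenu, y = prob, color = level).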

jobu · August 16, 2020, 7:12pm

hm, how would I do that?

AlexisW · August 16, 2020, 7:13pm

To display the content of pred.probs, just type it in the console and execute it.

jobu · August 16, 2020, 7:18pm

Well, then I get a probability for each case and each of my three groups... This is what I need, no? So I don't understand why I cannot create the graph...

AlexisW · August 16, 2020, 7:19pm

What does it look like? Which multinom function are you using, the one from nnet? What are the variables level, mean, upper and lower that you want to plot (the mean of what?)?

jobu · August 16, 2020, 7:32pm

Yes, the one from nnet. It should look like the plot at the end of this post: https://www.politikwissenschaften.ch/search.php?q=multinom

AlexisW · August 16, 2020, 7:56pm

Are you specifically referring to the post Multinomiale Logistische Regression in R from May 16th? Then there is a trap: he doesn't use the predict() function that is provided by nnet, but he uses the function predicts() that is implemented in his package glm.predict. That function seems to compute automatically a bunch of other statistics that are ready for plotting:


# 10 cases, matching the 10 rows of output shown below
test_data <- data.frame(fac.num = sample(LETTERS[1:3], 10, replace=TRUE),
                        revenu = sample(letters[1:3], 10, replace=TRUE))

mod <- nnet::multinom(factor(fac.num)~ factor(revenu), data=test_data)
predict(mod, type = "probs")
#            A            B            C
# 1  0.9998222 0.0001139447 6.387267e-05
# 2  0.5000312 0.2499988295 2.499700e-01
# 3  0.5000312 0.2499988295 2.499700e-01
# 4  0.2500355 0.5000046104 2.499599e-01
# 5  0.2500355 0.5000046104 2.499599e-01
# 6  0.5000312 0.2499988295 2.499700e-01
# 7  0.5000312 0.2499988295 2.499700e-01
# 8  0.2500355 0.5000046104 2.499599e-01
# 9  0.2500355 0.5000046104 2.499599e-01
# 10 0.9998222 0.0001139447 6.387267e-05
glm.predict::predicts(mod, values = "F")
#        mean        lower     upper factor(revenu level
# 1 0.2968459 1.843498e-71 1.0000000             a     A
# 2 0.3254653 8.661279e-89 1.0000000             a     B
# 3 0.3776888 6.733555e-97 1.0000000             a     C
# 4 0.4433916 9.687225e-02 0.8230188             b     A
# 5 0.2714595 2.324654e-02 0.7715126             b     B
# 6 0.2851489 2.135547e-02 0.7699831             b     C
# 7 0.2261655 3.432994e-02 0.6102105             c     A
# 8 0.4740600 3.614687e-02 0.9152489             c     B
# 9 0.2997745 1.472718e-02 0.8785041             c     C

AlexisW · August 16, 2020, 7:59pm

And most importantly:

class(predict(mod, type = "probs"))
# [1] "matrix" "array" 
class(glm.predict::predicts(mod, values = "F"))
# [1] "data.frame"
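
Because that result is already a data frame with mean, lower, upper and level columns, it can go straight into ggplot(). The sketch below does not call glm.predict at all: it rebuilds the data frame by hand from the (rounded) values printed above, and it names the income column revenu, which is my own choice for readability. Treat it as an illustration of the plotting step, not as the blog author's exact code.

```r
library(ggplot2)

# Values copied (rounded to 3 digits) from the predicts() output printed above.
# The income column is named `revenu` here by assumption.
pred <- data.frame(
  mean   = c(0.297, 0.325, 0.378, 0.443, 0.271, 0.285, 0.226, 0.474, 0.300),
  lower  = c(0.000, 0.000, 0.000, 0.097, 0.023, 0.021, 0.034, 0.036, 0.015),
  upper  = c(1.000, 1.000, 1.000, 0.823, 0.772, 0.770, 0.610, 0.915, 0.879),
  revenu = rep(c("a", "b", "c"), each = 3),
  level  = rep(c("A", "B", "C"), times = 3)
)

# Dodge the three outcome levels so their error bars do not overlap.
dodge <- position_dodge(width = 0.4)

p <- ggplot(pred, aes(x = revenu, y = mean, color = level)) +
  geom_point(position = dodge) +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2, position = dodge) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Income", y = "Predicted probability") +
  theme_minimal()

p
```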

jobu · August 16, 2020, 7:59pm

Yes, I'm confused, because I worked with the glm.predict package and it did not work...

AlexisW · August 16, 2020, 8:05pm

Again, without error messages, test data, and exact commands, I can't tell. It seems to work pretty well on my dummy data.

I'd suggest taking a good look at the documentation of glm.predict to understand the important parameters. If you still can't get it to work, post a new question with a clear description of what you tried (a reprex) and what the error message is.

kenbutler · August 22, 2020, 10:11pm

The output from predict contains only the probabilities of being in the various classes, given the input variable values. You don't have those values because they were in the original dataframe, not in pred.probs.

One thing to try is to use cbind to glue together voxit_destill_1996 and pred.probs (not bind_cols because pred.probs is a matrix rather than a dataframe) and then make your plot, since then you'll have your inputs and your predictions in one place.
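
A rough sketch of that idea on made-up data follows. The names toy, combined and per_level are placeholders, and the toy data has no missing values, so the row counts of the data and the predictions match. If the real data has incomplete rows that the model dropped, the counts may differ and cbind will not line up. With a single categorical predictor, every case at the same income level gets the same predicted probabilities, so I collapse to one row per level.

```r
library(nnet)

# Toy data standing in for the real data set; the values are random.
set.seed(42)
toy <- data.frame(
  fac.num = factor(sample(c("A", "B", "C"), 90, replace = TRUE)),
  revenu  = factor(sample(1:3, 90, replace = TRUE))
)
mod <- multinom(fac.num ~ revenu, data = toy, trace = FALSE)

# Glue inputs and predictions together; the matrix columns A, B, C
# become ordinary data frame columns.
combined <- cbind(toy, predict(mod, type = "probs"))

# One row per income level: average the predicted probabilities within each
# level (they are constant within a level here, so the mean just collapses them).
per_level <- aggregate(cbind(A, B, C) ~ revenu, data = combined, FUN = mean)

per_level
```

Either combined or per_level then holds the inputs and the predictions in one place for plotting.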

Is the plot on page 190 of http://ritsokiguess.site/STAD29/slides_d29.pdf anything like what you're trying to make? I'm not sure you can do confidence intervals on these very easily.
