Hello everyone!
I have to run a multinomial logistic regression for my bachelor thesis, since I have a multinomial dependent variable and several metric independent variables.
I used the following script, but my accuracy on the training data is only 56.4%, which seems really low, right?
(I'm an absolute RStudio beginner.)
Can I still use the data? Or is there a way to improve the accuracy of the model?
I already checked for multicollinearity and deleted the outliers. (Before I deleted them I had higher accuracy. Should I leave them in?)
i followed this site:
https://datasciencebeginners.com/2018/12/20/multinomial-logistic-regression-using-r/
I'd be really happy for any advice!
Yours sincerely, Lea
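library(dplyr)  # provides sample_frac()
tabelle <- read.csv("tabelleohneausreißer.csv", sep = ";")
fix(tabelle)  # opens the data editor to inspect the table
train <- sample_frac(tabelle, 0.7)  # random 70% of the rows for training
sample_id <- as.numeric(rownames(train))
test <- tabelle[-sample_id, ]  # remaining rows for testing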
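A minimal sketch of an alternative split, in case it helps: depending on the dplyr version, `sample_frac()` may not keep the original row names, in which case `rownames(train)` would no longer identify the sampled rows and `train` and `test` could overlap. The version below samples row indices with base R instead. The small data frame is a made-up stand-in for `tabelle`, and `train_idx` and the seed are chosen only for illustration.

```r
# Made-up stand-in for `tabelle`, only so that the sketch runs on its own
set.seed(42)  # arbitrary seed so the split is reproducible
tabelle <- data.frame(
  a = factor(rep(c("Vegetation", "Merkmala", "Merkmalb"), length.out = 100)),
  b = rnorm(100)
)

# Set the reference class on the full data, before splitting
tabelle$a <- relevel(tabelle$a, ref = "Vegetation")

# Draw 70% of the row indices and use them for both subsets
train_idx <- sample(seq_len(nrow(tabelle)), size = floor(0.7 * nrow(tabelle)))
train <- tabelle[train_idx, ]
test <- tabelle[-train_idx, ]
```

Releveling `a` before the split means `train$a` and `test$a` share the same level order, and indexing by `train_idx` guarantees that no row ends up in both subsets.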
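train$a <- relevel(factor(train$a), ref = "Vegetation")  # make "Vegetation" the reference class
library(nnet)  # provides multinom()
multinom.fit <- multinom(a ~ b + c + d + e + f - 1, data = train)  # "- 1" drops the intercept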
weights: 18 (10 variable)
initial value 274.653072
iter 10 value 220.480331
final value 220.404239
converged
summary (multinom.fit)
Call:
multinom(formula = a ~ b + c + d + e + f - 1,
data = train)
Coefficients:
                    b           c          d         e            f
Merkmala -0.002147572 -0.04805716 -0.9840262 -1.150613  0.001257216
Merkmalb  0.104333462  0.04476998 -2.1787026  1.591302 -0.004514558
Std. Errors:
                  b          c         d         e            f
Merkmala 0.02384957 0.03294757 0.8306937 0.9734560 0.0009642184
Merkmalb 0.03884706 0.05458862 0.1015885 0.2658659 0.0014501817
Residual Deviance: 440.8085
AIC: 460.8085
exp(coef(multinom.fit))
b c d e f
Merkmala 0.9978547 0.9530793 0.3738031 0.3164429 1.0012580
Merkmalb 1.1099705 1.0457873 0.1131883 4.9101396 0.9954956
head (probability.table <- fitted(multinom.fit))
Vegetation Merkmala Merkmalb
1 0.6068753 0.3085744 0.08455033
2 0.4620414 0.3012380 0.23672058
3 0.6740011 0.2457066 0.08029228
4 0.6229242 0.2467514 0.13032444
5 0.5430746 0.4423710 0.01455433
6 0.6343817 0.3119502 0.05366813
train$precticed <- predict(multinom.fit,newdata=train,"class")
ctable <- table(train$a, train$precticed)
round((sum(diag(ctable))/sum(ctable))*100,2)
[1] 56.4
test$precticed <- predict(multinom.fit,newdata=test,"class")
ctable <- table (test$a,test$precticed)
round((sum(diag(ctable))/sum(ctable))*100,2)
[1] 42.99
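To judge whether 56.4 is low, it can help to compare it with the share of the most frequent class, which is what always predicting that class would score. The sketch below uses made-up labels in place of `test$a` and `test$precticed`; `actual`, `predicted`, `baseline` and the other names are hypothetical. It also shows one thing worth checking: `table()` does not align row and column levels, so `sum(diag(ctable))` only counts correct predictions when both factors have the same level order, and in the script above only `train$a` was releveled.

```r
# Made-up labels standing in for test$a (default alphabetical level order) ...
actual <- factor(c("Vegetation", "Vegetation", "Vegetation", "Vegetation",
                   "Merkmala", "Merkmala", "Merkmalb", "Merkmalb"))
# ... and for the predictions (level order after releveling to "Vegetation")
predicted <- factor(c("Vegetation", "Vegetation", "Vegetation", "Merkmala",
                      "Merkmala", "Vegetation", "Merkmalb", "Vegetation"),
                    levels = c("Vegetation", "Merkmala", "Merkmalb"))

# Compare the labels as text, so the result does not depend on level order
accuracy <- mean(as.character(predicted) == as.character(actual))

# Share of the most frequent class: the score of always guessing that class
baseline <- max(table(actual)) / length(actual)

# diag() is only meaningful once rows and columns use the same level order
ctable <- table(actual, factor(predicted, levels = levels(actual)))
aligned_accuracy <- sum(diag(ctable)) / sum(ctable)

# For contrast: the same calculation without aligning the levels first
naive_accuracy <- sum(diag(table(actual, predicted))) / length(actual)

print(c(accuracy = accuracy, baseline = baseline,
        aligned = aligned_accuracy, naive = naive_accuracy))
```

If the aligned and naive values differ on the real data, the reported test accuracy would need to be recomputed before drawing conclusions from it.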