Hi community,
To analyse my data, the literature suggests a multivariate probit model. I have six dependent variables and up to seven independent variables. I coded the data in Excel and then read it into R for the analysis. At first I got no output at all, but I have since managed to get something. Unfortunately, the following error message appears at the end and the results are not interpretable:
When I run warnings(), the same message comes up many times over, namely:
The correlation matrix is not positive definite
I don't quite understand it, because I did it the same way as in the paper "What hampers innovation? Revealed barriers versus deterring barriers" by D'Este et al. (2012).
ChatGPT couldn't help me any more either.
This is the code I need for the model:
model.MVP2 <- mvProbit(cbind(Cost.B, Knowledge.B, Market.B, Regulation.B, Data.B, Trust.B) ~ Little + Average + High + Very_High + LN_Employees + Higher_Education + RD_Invest, data = data)
summary(model.MVP2)
The data are all binary coded except for "LN_Employees + Higher_Education + RD_Invest", but this should not be a problem.
Assuming you do have a large sample, I wonder if including Little + Average + High + Very_High is creating a dummy variable trap? (Although usually R is good at handling this somewhat gracefully.)
No, that shouldn't be the problem. With 4 dummies for 5 categories you are not in the dummy-variable trap, so in principle this should work.
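To make the dummy-variable argument concrete, here is a minimal sketch on simulated data (the data frame `d` and the extra `None` reference dummy are hypothetical, not the poster's data). It compares the rank of the design matrix with four dummies against one that includes all five alongside the intercept.

```r
# Simulated five-level variable; "None" is an assumed name for the reference level.
set.seed(7)
level <- sample(c("None", "Little", "Average", "High", "Very_High"),
                size = 100, replace = TRUE)
d <- data.frame(
  None      = as.integer(level == "None"),
  Little    = as.integer(level == "Little"),
  Average   = as.integer(level == "Average"),
  High      = as.integer(level == "High"),
  Very_High = as.integer(level == "Very_High")
)

# Four dummies plus intercept: columns are linearly independent.
X_ok <- model.matrix(~ Little + Average + High + Very_High, data = d)

# All five dummies plus intercept: the dummies sum to the intercept column.
X_trap <- model.matrix(~ None + Little + Average + High + Very_High, data = d)

rank_ok   <- qr(X_ok)$rank
rank_trap <- qr(X_trap)$rank
cat("4 dummies: rank", rank_ok, "of", ncol(X_ok), "columns\n")
cat("5 dummies: rank", rank_trap, "of", ncol(X_trap), "columns\n")
```

A full-rank design only rules out perfect collinearity among the regressors. It says nothing about whether the estimated error correlation matrix stays positive definite during optimisation.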
One possibility is that one of the 5 categories is extremely rare and maybe this leads to estimation difficulties. How many fall into each of the five categories?
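A quick way to answer that question is to count the ones in each dummy column. The sketch below uses simulated data as a stand-in (the `data` frame here is made up and `Very_High` is deliberately rare); with the real data only the last few lines would be needed.

```r
# Hypothetical stand-in for the real data: four 0/1 dummies for a
# five-level variable, with the fifth level as the reference category.
set.seed(1)
level <- sample(c("None", "Little", "Average", "High", "Very_High"),
                size = 200, replace = TRUE,
                prob = c(0.30, 0.30, 0.25, 0.13, 0.02))
data <- data.frame(
  Little    = as.integer(level == "Little"),
  Average   = as.integer(level == "Average"),
  High      = as.integer(level == "High"),
  Very_High = as.integer(level == "Very_High")
)

dummies <- c("Little", "Average", "High", "Very_High")

# Each row should have at most one dummy set to 1.
stopifnot(all(rowSums(data[dummies]) <= 1))

# Observations per category; the reference category is whatever is left over.
counts <- colSums(data[dummies])
counts <- c(Reference = nrow(data) - sum(counts), counts)
print(counts)
```

It may also be worth cross-tabulating each rare category against each of the six dependent variables, since a category in which an outcome never varies can cause trouble as well.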
@Rstarterpack I don't see that you're doing anything wrong. Perhaps fold the very high group into the high group and then do individual probits. Then maybe use what you find to get starting values for the multivariate version.
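A rough sketch of that suggestion, using simulated data and, for brevity, only two of the six outcomes and one continuous regressor. The data frame and the merged dummy `High_Plus` are made up for illustration; the individual probits use base R's `glm`.

```r
# Simulated stand-in data (not the poster's data).
set.seed(42)
n <- 300
level <- sample(c("None", "Little", "Average", "High", "Very_High"),
                size = n, replace = TRUE,
                prob = c(0.30, 0.30, 0.25, 0.12, 0.03))
data <- data.frame(
  Little       = as.integer(level == "Little"),
  Average      = as.integer(level == "Average"),
  High         = as.integer(level == "High"),
  Very_High    = as.integer(level == "Very_High"),
  LN_Employees = rnorm(n, mean = 4, sd = 1),
  Cost.B       = rbinom(n, 1, 0.5),
  Knowledge.B  = rbinom(n, 1, 0.4)
)

# Fold the rare "very high" group into "high" (High_Plus is a hypothetical name).
data$High_Plus <- as.integer(data$High == 1 | data$Very_High == 1)

outcomes <- c("Cost.B", "Knowledge.B")
rhs <- "Little + Average + High_Plus + LN_Employees"

# One univariate probit per dependent variable.
fits <- lapply(outcomes, function(y) {
  glm(as.formula(paste(y, "~", rhs)),
      family = binomial(link = "probit"), data = data)
})
names(fits) <- outcomes

# Stack the coefficients equation by equation as candidate starting values.
start_beta <- unlist(lapply(fits, coef))
print(round(start_beta, 3))
```

If I read the mvProbit documentation correctly, the stacked vector could then be supplied through its `start` argument (with starting correlations through `startSigma`), but please check `?mvProbit` for the expected ordering before relying on this. The individual fits are also useful on their own: if one of them already shows huge coefficients or standard errors, that equation is a likely source of the problem.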