RStudio does not show the estimate for the interaction term and shows the fixed-effect estimates in an unexpected way

Hi! I'm currently having trouble with a regression in RStudio: the results are not shown as expected. This is the regression model:

lm(formula = bid_ask_spread ~ time + after + after:time + as.factor(firm_no) + as.factor(country_no), data = df_complete)

The following are the first few lines of the results (there are many more fixed effects):

Residuals:
    Min      1Q  Median      3Q     Max
-6420.0  -102.0    -8.3    72.0 19165.4

Coefficients: (11 not defined because of singularities)
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)          -1.396e+04  6.514e+02 -21.434  < 2e-16 ***
time                  1.163e-05  5.428e-07  21.435  < 2e-16 ***
after                 4.680e+02  1.227e+01  38.141  < 2e-16 ***
as.factor(firm_no)2   1.453e+00  6.920e+01   0.021 0.983243
as.factor(firm_no)3   3.280e-01  6.920e+01   0.005 0.996218

These results are unexpected in two ways: (1) the estimate for the interaction term is missing, and (2) R does not show the fixed effect for firm_no 1.

Could you help me find the reason for this? I'd appreciate it massively.

Thanks and have a good day,
Ben

Dummy variable trap.

You can't have both an intercept and a complete set of dummies. That would be redundant.

Could you elaborate on that?
To my understanding, the dummy variable trap occurs when two dummies are perfectly collinear; however, in this regression "after" is the only dummy variable.

Are you referring to all the fixed effects? They could probably just replace the intercept; I will find out how to remove the intercept estimate.

How does this play into the interaction term?

You can either have an intercept and leave one dummy out of each category, or leave out the intercept and have one category with a complete set of dummies and the others with one left out.
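To make the two options concrete, here is a minimal sketch on made-up data. The data frame `d` and its columns `y` and `firm` are hypothetical stand-ins, not the poster's `df_complete`. It first shows that an intercept plus a complete set of dummies is rank deficient, then fits both valid codings.

```r
# Toy data: 3 firms, 2 observations each (hypothetical values)
d <- data.frame(
  y    = c(1.0, 1.2, 2.1, 1.9, 3.2, 2.8),
  firm = factor(c(1, 1, 2, 2, 3, 3))
)

# A complete set of firm dummies, built without an intercept
full_dummies <- model.matrix(~ 0 + firm, data = d)

# Adding an intercept column makes the columns linearly dependent:
# the three dummies sum to the column of ones
X_trap <- cbind(Intercept = 1, full_dummies)
rank_trap <- qr(X_trap)$rank   # smaller than ncol(X_trap)

# Option 1: intercept plus all-but-one dummy (what lm() does by default)
fit_with_intercept <- lm(y ~ firm, data = d)

# Option 2: no intercept, complete set of dummies
fit_no_intercept <- lm(y ~ 0 + firm, data = d)

coef(fit_with_intercept)
coef(fit_no_intercept)
```

Both fits give the same fitted values; only the meaning of the coefficients changes. With a second factor in the model (such as country), only the first factor gets its complete set of dummies once the intercept is removed.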

With the interactions, there is also the possibility (depends on the data) that some interacted categories are redundant.
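As one illustration of how this can depend on the data (a sketch on made-up data, not a diagnosis of the original model): if `time` takes a single value whenever `after` is 1, the interaction is just a multiple of `after`, and lm() reports its coefficient as NA.

```r
# Hypothetical data: time varies before the event but is constant after it
d <- data.frame(
  y     = c(2.0, 2.5, 3.1, 3.4, 5.0, 5.2, 4.9, 5.1),
  time  = c(1,   2,   3,   4,   5,   5,   5,   5),
  after = c(0,   0,   0,   0,   1,   1,   1,   1)
)

fit <- lm(y ~ time + after + after:time, data = d)

# The interaction column equals 5 * after here, so it adds no information
# and its coefficient comes back as NA
coef(fit)

# Names of the coefficients that lm() could not estimate
dropped <- names(coef(fit))[is.na(coef(fit))]
dropped
```

On the real model, the same `is.na(coef(...))` filter lists the 11 coefficients behind the "not defined because of singularities" message, and `alias()` on the fitted model shows which columns they depend on.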

lm() uses the first level of each factor as the baseline, and the effects of all other levels of that factor are relative to it. So as.factor(firm_no)2 has an effect of 1.453 relative to as.factor(firm_no)1, which is why there is no separate row for firm_no 1.
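A small sketch of what "relative to the baseline" means, again on hypothetical data (`d`, `y` and `firm` are made up for illustration). With a single factor, the intercept is the baseline group's mean and each dummy is a difference from it; relevel() picks a different baseline.

```r
# Hypothetical data with three firms
d <- data.frame(
  y    = c(1.0, 1.2, 2.1, 1.9, 3.2, 2.8),
  firm = factor(c(1, 1, 2, 2, 3, 3))
)

# Default coding: firm 1 is the baseline and is absorbed into the intercept
fit_default <- lm(y ~ firm, data = d)
coef(fit_default)

# The intercept matches the firm 1 mean; the firm2 coefficient matches
# the difference between the firm 2 mean and the firm 1 mean
mean_firm1 <- mean(d$y[d$firm == "1"])
mean_firm2 <- mean(d$y[d$firm == "2"])

# Choosing firm 3 as the baseline instead
d$firm_ref3 <- relevel(d$firm, ref = "3")
fit_ref3 <- lm(y ~ firm_ref3, data = d)
coef(fit_ref3)
```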


Does the message "(... not defined because of singularities)" still appear if you remove as.factor(country_no)?

It should not be an issue but I am curious: Why did you use time + after + after:time instead of time*after?
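For reference, the two spellings expand to the same terms. A quick check on simulated data (the data frame `d` and the coefficients used to generate `y` are arbitrary assumptions):

```r
# Simulated data, purely for illustration
set.seed(1)
d <- data.frame(
  time  = rep(1:10, times = 2),
  after = rep(c(0, 1), each = 10)
)
d$y <- 1 + 0.5 * d$time + 2 * d$after + 0.3 * d$time * d$after + rnorm(20)

fit_long  <- lm(y ~ time + after + after:time, data = d)
fit_short <- lm(y ~ time * after, data = d)

# Both formulas expand to the same set of term labels
attr(terms(fit_long), "term.labels")
attr(terms(fit_short), "term.labels")

# ...and therefore to the same coefficients
all.equal(coef(fit_long), coef(fit_short))
```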

By the way, I ran a few quick checks, and it seems that the interaction term, if shown, tends to appear after all the lower-order terms. Not sure if this is relevant.
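It may well be relevant: by default R sorts the terms of a formula by order, so all main effects (including every factor dummy) come before the two-way interactions, regardless of where the interaction is written in the formula. A sketch on simulated data (the data frame `d` and the four-level `firm` factor are hypothetical):

```r
# Simulated data with a firm factor, purely for illustration
set.seed(2)
d <- data.frame(
  time  = rep(1:6, times = 4),
  after = rep(rep(c(0, 1), each = 3), times = 4),
  firm  = factor(rep(1:4, each = 6))
)
d$y <- rnorm(nrow(d))

# The interaction is written before the firm dummies in the formula...
fit <- lm(y ~ time + after + after:time + firm, data = d)

# ...but it is listed after them in the coefficient table
coef_names <- names(coef(fit))
coef_names
```

With many fixed effects, the interaction row would then sit at the very bottom of the summary, below the lines quoted in the question. I can't tell from the excerpt whether that is what happened here; looking at the last rows of `coef(summary(...))` for the real model (for example with `tail()`) would settle it.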
