How to run a multilevel regression model with crossed random effects: items and participants?

Paige_Cox · May 16, 2022, 1:25pm

I am trying to run a multilevel regression for my study.
I have two random effects: participants (97) and items (the 20 words used in the study).
Every participant had to spell the same 20 words, so participants and items are crossed.

My outcome variable is spelling accuracy, which has 2 levels: 1 for correct and 0 for incorrect.

My predictor variables are all continuous word features: word length, word frequency, OLD20 (neighbourhood density), and OLDF (neighbourhood frequency).

I want to use the raw spelling scores in a regression model without aggregating them first, which is why I need a multilevel regression model.

I'm trying to figure out the correct code to use but haven't had any luck so far. This is what I've got:

M1 <- lmer(spelling_accuracy ~ 1 + OLD20 + OLDF + wordlength_letter + word_frequency + (1|items) + (1|participants), data = combined_df, REML = FALSE)

This gives me the following messages:

fixed-effect model matrix is rank deficient so dropping 18 columns / coefficients
boundary (singular) fit: see help('isSingular')

Then when I check the model by running:

library(jtools)
summ(M1)

I get the following:

MODEL INFO:
Observations: 1940
Dependent Variable: spelling_accuracy
Type: Mixed effects linear regression 

MODEL FIT:
AIC = 1658.29, BIC = 1786.41
Pseudo-R² (fixed effects) = 0.14
Pseudo-R² (total) = 0.53 

FIXED EFFECTS:
----------------------------------------------------------
                     Est.   S.E.   t val.      d.f.      p
----------------- ------- ------ -------- --------- ------
(Intercept)          0.52   0.05    11.01    397.48   0.00
OLD201.7             0.19   0.05     3.79   1843.00   0.00
OLD201.85            0.26   0.05     5.26   1843.00   0.00
OLD201.9             0.14   0.05     2.94   1843.00   0.00
OLD201.95            0.22   0.05     4.42   1843.00   0.00
OLD202.0             0.12   0.05     2.52   1843.00   0.01
OLD202.25           -0.40   0.05    -8.20   1843.00   0.00
OLD202.35           -0.01   0.05    -0.21   1843.00   0.83
OLD202.45            0.07   0.05     1.47   1843.00   0.14
OLD202.5            -0.01   0.05    -0.21   1843.00   0.83
OLD202.65            0.16   0.05     3.37   1843.00   0.00
OLD202.7            -0.02   0.05    -0.42   1843.00   0.67
OLD202.9            -0.06   0.05    -1.26   1843.00   0.21
OLD203.0             0.20   0.05     4.00   1843.00   0.00
OLD203.05           -0.44   0.05    -9.05   1843.00   0.00
OLD203.35           -0.02   0.05    -0.42   1843.00   0.67
OLD203.4            -0.16   0.05    -3.37   1843.00   0.00
OLD203.5             0.15   0.05     3.16   1843.00   0.00
OLDF12.7            -0.21   0.05    -4.21   1843.00   0.00
OLDF4.6              0.04   0.05     0.84   1843.00   0.40
----------------------------------------------------------

p values calculated using Satterthwaite d.f.

RANDOM EFFECTS:
----------------------------------------
    Group        Parameter    Std. Dev. 
-------------- ------------- -----------
 participants   (Intercept)     0.31    
    items       (Intercept)     0.00    
   Residual                     0.34    
----------------------------------------

Grouping variables:
--------------------------------
    Group       # groups   ICC  
-------------- ---------- ------
 participants      97      0.45 
    items          20      0.00 
--------------------------------

I'm not sure why the model output shows almost only OLD20 (plus two OLDF rows), or why it presents each value as a separate level. I have made sure to code the variables correctly:

as.numeric(combined_df$OLD20)
as.numeric(combined_df$OLDF)
as.factor(combined_df$spelling_accuracy)
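
A possible explanation, offered as a guess rather than something checked against the real data: the three calls above print a converted vector but never assign it back, so combined_df itself is unchanged. If OLD20 is stored as a factor, as.numeric() would also return the internal level codes rather than the original values. The sketch below illustrates both points on a made-up data frame (toy_df and its values are hypothetical stand-ins for combined_df).

```r
# Hypothetical toy data standing in for combined_df; values are made up.
toy_df <- data.frame(
  OLD20 = factor(c("1.7", "1.85", "2.25", "1.7")),
  spelling_accuracy = c(1, 0, 1, 1)
)

# Without assignment, the conversion is printed and then discarded.
as.numeric(toy_df$OLD20)
is.factor(toy_df$OLD20)  # the column is still a factor

# as.numeric() on a factor gives level codes, so convert via as.character()
# first, and assign the result back to the column.
toy_df$OLD20 <- as.numeric(as.character(toy_df$OLD20))

# Inspect the column types after the conversion.
str(toy_df)
```

Running str(combined_df) before fitting would show whether OLD20 and OLDF are really numeric in the data that the model sees.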

Any help with fixing my code for the model would be really appreciated.
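
For reference, one direction that might be worth trying is a logistic mixed model, since the outcome is binary (0/1) and lmer() fits a linear model. The following is only a sketch on simulated data, assuming the lme4 package is installed: sim_df, the simulated effect sizes, and the z-scored predictor names are all invented for illustration, and only two of the four predictors are included to keep it short. The crossed random intercepts are written the same way as in the lmer() call.

```r
library(lme4)

set.seed(1)

# Simulated stand-in for combined_df: 97 participants x 20 items.
# Every value below is made up for illustration only.
items_df <- data.frame(
  items = factor(1:20),
  OLD20 = runif(20, 1.5, 3.5),
  wordlength_letter = sample(4:10, 20, replace = TRUE)
)
grid_df <- expand.grid(participants = factor(1:97), items = factor(1:20))
sim_df <- merge(grid_df, items_df, by = "items")

# Predictors must be numeric; z-scoring them often helps convergence.
sim_df$OLD20_z <- as.numeric(scale(sim_df$OLD20))
sim_df$wordlength_z <- as.numeric(scale(sim_df$wordlength_letter))

# Simulate a 0/1 outcome with participant and item random intercepts.
p_int <- rnorm(97, sd = 1)
i_int <- rnorm(20, sd = 0.5)
eta <- 0.5 - 0.8 * sim_df$OLD20_z +
  p_int[as.integer(sim_df$participants)] +
  i_int[as.integer(sim_df$items)]
sim_df$spelling_accuracy <- rbinom(nrow(sim_df), 1, plogis(eta))

# Logistic mixed model with crossed random intercepts for items and participants.
M2 <- glmer(
  spelling_accuracy ~ 1 + OLD20_z + wordlength_z +
    (1 | items) + (1 | participants),
  data = sim_df,
  family = binomial
)
summary(M2)
```

With numeric predictors, each word feature gets a single slope instead of one coefficient per distinct value, which is what the 18 dropped columns in the lmer() output hint at. The remaining predictors (OLDF, word_frequency) would be added to the formula in the same way.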
