Hi everyone,
I’m finalizing a paper in cognitive science (spatial localization / representational momentum) and am in the middle of a debate about my analysis pipeline. I would appreciate a sanity check on whether my Linear Mixed-Effects (LME) approach is defensible or whether I should revert to the standard Repeated-Measures (RM) ANOVA that is conventional in this literature.
Design: 2×2 Within-Subjects Factorial (Geometry × Visibility).
Structure: 3 separate experiments (N ≈ 23 each).
Data: ~80 trials per participant (≈ 1,800 trials per experiment in total).
Outcome: Continuous (Forward Error in pixels).
What I Did (The LME Approach)
I fit a trial-level LME using lme4 / lmerTest in R.
Contrast Coding: Sum-to-zero contrasts (+/-0.5) for both factors.
Random Effects: maximal structure (random intercepts plus random slopes for Geometry, Visibility, and their interaction). I use || (uncorrelated random effects) to aid convergence when the full covariance matrix is singular.
Model Specification
library(lmerTest)   # loads lme4 and adds denominator-df-based F-tests

model <- lmer(error ~ Geometry * Visibility +
                (1 + Geometry * Visibility || ParticipantID),
              data = df,
              control = lmerControl(optimizer = "bobyqa"))
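Before trusting the fixed effects I check whether the maximal structure is actually supported (just a sketch of the checks I run on the fitted object above):

isSingular(model)       # TRUE -> some variance components estimated at ~0
summary(rePCA(model))   # effective dimensionality of the random-effects structure
VarCorr(model)          # inspect which slope variances are near zero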
Omnibus Test
anova(model, ddf = "Kenward-Roger")
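As a cross-check on the denominator-df method (a sketch, not in the paper; as far as I understand, car::Anova with F-tests also uses Kenward-Roger for lmer fits, and with N ≈ 23 per experiment the approximations should agree closely):

anova(model, ddf = "Satterthwaite")        # lmerTest's default df method
car::Anova(model, test.statistic = "F")    # Kenward-Roger F-tests via pbkrtest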
(Happy to share the full code if that is useful for the debate.)
The results also agree with a conventional participant-aggregated RM ANOVA: it reproduces the same three significant effects (Geometry, Visibility, and their interaction). My plan was to report the LME as the main analysis and put the RM ANOVA in an appendix as a sanity check. However, I am not yet fully secure in this decision, mainly because I am not fully secure with LMEs themselves.
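The appendix cross-check is essentially this (a sketch; same variable names as in the LME, and ParticipantID must be a factor):

# aggregate to one mean per participant x cell, then the conventional within-subjects ANOVA
agg <- aggregate(error ~ ParticipantID + Geometry + Visibility, data = df, FUN = mean)
rm_aov <- aov(error ~ Geometry * Visibility +
                Error(ParticipantID / (Geometry * Visibility)),
              data = agg)
summary(rm_aov)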
Caveat: exploratory/secondary analysis. Because I observed noticeable individual variation, I also refit the models including Group as a between-participant factor (with interactions such as Geometry × Visibility × Group), so at that point the participant level itself becomes of interest again (sketched below).
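Roughly this (a sketch only; Group stands for the between-participant grouping variable, coded with sum contrasts like the other factors):

model_grp <- lmer(error ~ Geometry * Visibility * Group +
                    (1 + Geometry * Visibility || ParticipantID),
                  data = df,
                  control = lmerControl(optimizer = "bobyqa"))
anova(model_grp, ddf = "Kenward-Roger")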
What do you say to the following arguments for the LME over the RM ANOVA?
LME is the natural extension of RM ANOVA for trial-level repeated-measures data because it models within-subject dependence explicitly via random effects (intercepts + slopes).
It avoids listwise deletion if a participant ends up with an incomplete cell after trial exclusion (not an issue here, it seems). Unlike the ANOVA, which treats all participant means as equally reliable regardless of their variance or trial count, the LME weights subject-level estimates by their precision. This makes the LME estimates (betas) more robust to noisy participants than raw ANOVA cell means, even when the p-values agree. This matters because we have unbalanced data due to missing trials and a two-step trial-level exclusion, and the partial pooling in the LME makes the results more robust to that imbalance (see the sketch below).
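To make the partial-pooling point concrete, this is the kind of comparison I have in mind (a sketch only, not part of the paper; with the ±0.5 coding the intercept is the grand mean):

# per-participant raw means (no pooling) vs. the LME's shrunken intercepts (partial pooling);
# noisy or low-trial participants are pulled more strongly toward the grand mean
raw_means <- tapply(df$error, df$ParticipantID, mean)
shrunken  <- coef(model)$ParticipantID[["(Intercept)"]]
cbind(raw = raw_means, shrunken = shrunken)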
Trial-level power vs. aggregation: we have roughly 1,800 trials per experiment; aggregating these down to a single mean per condition throws away information about trial-by-trial variance (we have a random speed jitter on each trial as well as a random starting position). The LME models the raw data at the trial level, so it uses all retained trials while still testing the effects against between-participant variability (via the random slopes), and here it matches the RM ANOVA conclusions.
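At the trial level, those nuisance sources of variance could also be partialled out directly, which the aggregated ANOVA cannot do (a sketch only; speed_jitter and start_pos are placeholder column names, this is not in the paper):

model_cov <- lmer(error ~ Geometry * Visibility +
                    scale(speed_jitter) + scale(start_pos) +
                    (1 + Geometry * Visibility || ParticipantID),
                  data = df,
                  control = lmerControl(optimizer = "bobyqa"))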
However, Occam’s razor usually suggests using the simpler tool (the ANOVA), which leads to the next point:
Efficiency (parsimony): with participants treated as fixed, the RM ANOVA has to estimate a specific intercept for each one, i.e., it learns each identity. The LME instead estimates the variance (SD) of the participant population and therefore has more power left to detect the actual experimental effects (Geometry / Visibility). Roughly the contrast I mean is sketched below.
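A rough parameter-count comparison (a sketch only; the lm() with participant dummies stands in for the "participants as fixed" view):

fixed_id <- lm(error ~ Geometry * Visibility + ParticipantID, data = df)
length(coef(fixed_id))                  # 4 effect terms + (N - 1) participant dummies
nrow(as.data.frame(VarCorr(model)))     # the LME: a handful of variance components instead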
Please debate with me on the above reasoning! Thank you!