I am trying to compare 3 groups of participants (90 participants in total) on their performance (i.e., correct identification of safety hazards around the home, coded 0/1) while watching movie footage. Each participant watched the same movie, which contained 40 hazards of 5 different types.

That is, I am trying to assess performance according to participant's group and hazard type.

My data looks like this:

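| participant | group | hazard_type | response |
|-------------|-------|-------------|----------|
| 1           | 1     | 4           | 1        |
| 1           | 1     | 2           | 1        |
| 1           | 1     | 4           | 0        |
| 1           | 1     | 3           | 0        |
| 1           | 1     | 3           | 0        |
| 2           | 1     | 4           | 0        |
| 2           | 1     | 3           | 0        |
| 2           | 1     | 3           | 0        |
| 2           | 1     | 1           | 1        |
| 3           | 2     | 1           | 0        |
| 3           | 2     | 1           | 0        |
| 3           | 2     | 1           | 0        |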

I am not sure how to structure the model/what to enter as a fixed or random effect.

I am trying to fit a GLMM where ‘group’ is defined as a fixed effect, ‘participant’ is defined as a random effect and ‘hazard type’ is defined as a repeated measure (as each participant identified several hazards of several hazard types in the movie). I am trying something along the lines of:

Your model is almost fine, but it is misspecified: since you are dealing with a binary outcome, you should define an adequate family, e.g. family = binomial().

As for fixed vs. random, this is a complex topic, but briefly, it depends on what you want:

Declaring an effect as random means recognizing that its levels are just a sample from a larger population of levels. So here, treating hazard_type as random would imply that the 5 hazard types are a random sample from many hazard types. Since you use a simple glmer call, it would also imply that the effect of each hazard type on the logit (in the binomial case, i.e. logistic regression) is normally distributed in the population of hazard types. This is key, and a main point of random-effects models: you save parameters at the cost of making an assumption about the distribution of the effects. Another consequence is that the model estimates the variance of the hazard type effects, but you won't get direct estimates of the effect of each type. You can infer them a posteriori (these are called BLUPs), but I would not recommend that in your case. Since a random effect estimates a variance, one may also wonder whether a sample of 5 types is sufficient to estimate the variance between types (I think it is borderline).
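To make the random-effect option concrete, here is a minimal sketch of what that specification returns. It runs on simulated stand-in data (not the poster's data), and the formula is an assumption about how hazard_type would be entered as random. The point is only that the fit reports a between-type variance, while per-type effects have to be extracted afterwards as predictions.

```r
library(lme4)

set.seed(1)

# Simulated stand-in data (assumption): 90 participants in 3 groups,
# each seeing the same 40 hazards, which fall into 5 hazard types.
d <- expand.grid(participant = 1:90, hazard = 1:40)
d$group       <- factor((d$participant - 1) %% 3 + 1)
d$hazard_type <- factor((d$hazard - 1) %% 5 + 1)
d$participant <- factor(d$participant)

p_eff <- rnorm(90, sd = 0.8)  # participant deviations on the logit scale
t_eff <- rnorm(5,  sd = 0.5)  # hazard-type deviations on the logit scale
eta   <- p_eff[as.integer(d$participant)] + t_eff[as.integer(d$hazard_type)]
d$response <- rbinom(nrow(d), 1, plogis(eta))

# hazard_type entered as a random intercept: one variance is estimated
# for the 5 types instead of 4 fixed-effect coefficients.
m_random <- glmer(response ~ group + (1 | participant) + (1 | hazard_type),
                  data = d, family = binomial())

VarCorr(m_random)            # estimated variances of the random intercepts
ranef(m_random)$hazard_type  # predicted per-type deviations (not fixed estimates)
```

With only 5 levels, the hazard_type variance tends to be imprecise, and lme4 may report a singular fit on some data sets.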

So the question is: are those 5 types all the types there are? If so, model them as fixed. Or are they 5 types out of many other types of hazard? If so, stick to random, but perhaps try to sample some more hazard types...

For participants, it is a no-brainer: random makes sense.

Regarding ‘hazard type’: Indeed, the five hazard types are all there are (i.e., fixed). However, each participant watched all of the hazards and identified several of them (would that imply the necessity of using repeated measures?).

I am not sure how to model this factor. How would you suggest adding it to the glmer?

Then, aside from the missing family argument, you should be good! The random effect accounting for the variation between participants is a legitimate way to account for the pseudoreplication in your data!
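As a minimal sketch of the model this thread converges on, the code below treats group and hazard_type as fixed effects, gives participant a random intercept, and sets the binomial family. The data are simulated purely for illustration, and the exact formula is an assumption because the original call is not shown. The interaction model is an optional extra, included only to show how one might check whether group differences vary by hazard type.

```r
library(lme4)

set.seed(42)

# Simulated stand-in data (assumption): 90 participants in 3 groups,
# 40 hazards of 5 types, binary response (1 = hazard identified).
d <- expand.grid(participant = 1:90, hazard = 1:40)
d$group       <- factor((d$participant - 1) %% 3 + 1)
d$hazard_type <- factor((d$hazard - 1) %% 5 + 1)

u   <- rnorm(90, sd = 0.8)  # participant random intercepts (logit scale)
eta <- -0.5 +
  0.4 * (as.integer(d$group) - 1) +
  0.3 * (as.integer(d$hazard_type) - 1) +
  u[d$participant]
d$response    <- rbinom(nrow(d), 1, plogis(eta))
d$participant <- factor(d$participant)

# Additive model: both factors fixed, participant as random intercept.
# The random intercept handles the repeated measures per participant.
m_add <- glmer(response ~ group + hazard_type + (1 | participant),
               data = d, family = binomial())

# Optional: let the group effect differ by hazard type.
m_int <- glmer(response ~ group * hazard_type + (1 | participant),
               data = d, family = binomial())

summary(m_add)       # fixed effects are log-odds ratios vs. the reference levels
anova(m_add, m_int)  # likelihood-ratio test of the interaction
```

Coefficients are on the logit scale, so `exp(fixef(m_add))` gives odds ratios relative to the reference group and reference hazard type.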