Family Choice for GLM with 0-1 Dependent Variable

I want to analyze the relationship between the dependent variable "Knowledge_index" (a numeric variable ranging from 0 to 1, both inclusive) and several independent variables of type "factor": "Age", "Gender", "Place_birth", "Residence_time", "Education" and "Professional_sector".

str(data$Knowledge_index)
 num [1:430] 0.88 0.63 0.75 0.25 1 1 0.88 0.75 0.88 0 ...

The model:

glm1 <- glm(Knowledge_index ~ Age + Gender + Place_birth + Residence_time + Education + Professional_sector, data = data, family = quasibinomial(link = "logit"))

Since the dependent variable takes values between 0 and 1, my understanding is that it is correct to use a quasibinomial distribution with a logit link function (family = quasibinomial(link = "logit")). Is this correct, or should I use another family?
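For context on the family question, here is a minimal sketch on simulated data, not the survey data from this post. The data frame `toy`, its single predictor, and the success probabilities are hypothetical. It illustrates that `binomial()` and `quasibinomial()` give the same coefficient estimates for a fractional response; they differ only in that `quasibinomial()` estimates a dispersion parameter and scales the standard errors by its square root.

```r
set.seed(42)
n <- 300

# Hypothetical stand-in for the survey: one factor predictor only
toy <- data.frame(Gender = factor(sample(c("F", "M"), n, replace = TRUE)))

# Simulated index: share of 4 questions answered correctly
p_true <- ifelse(toy$Gender == "M", 0.70, 0.55)
toy$Knowledge_index <- rbinom(n, size = 4, prob = p_true) / 4

fit_quasi <- glm(Knowledge_index ~ Gender, data = toy,
                 family = quasibinomial(link = "logit"))

# binomial() warns about non-integer successes for a fractional response,
# so the warning is suppressed here for the comparison
fit_binom <- suppressWarnings(
  glm(Knowledge_index ~ Gender, data = toy, family = binomial(link = "logit"))
)

# Estimated dispersion and standard errors from both fits
phi      <- summary(fit_quasi)$dispersion
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]
se_binom <- summary(fit_binom)$coefficients[, "Std. Error"]

print(cbind(quasi = coef(fit_quasi), binom = coef(fit_binom)))
print(cbind(se_quasi, se_binom_scaled = se_binom * sqrt(phi)))
```

The point estimates match, so the choice between the two families mainly affects the standard errors and the tests derived from them.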

A histogram of the dependent variable is also attached.

Below I add more information about the construction of the index (the dependent variable). The index is built from 4 questions asked to the respondent (I1, I2, I3, I4), each scored 1 if answered correctly and 0 otherwise. The calculation is: Knowledge_index = (0.25 * I1) + (0.25 * I2) + (0.25 * I3) + (0.25 * I4). For example, if the respondent answers all 4 questions correctly, the result is Knowledge_index = (0.25 * 1) + (0.25 * 1) + (0.25 * 1) + (0.25 * 1) = 1. If the respondent answers three questions correctly and one incorrectly, the result is Knowledge_index = (0.25 * 1) + (0.25 * 1) + (0.25 * 1) + (0.25 * 0) = 0.75.
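The construction above can be sketched in R as follows. The `answers` data frame and its three respondents are made up for illustration; only the item names I1 to I4 and the 0.25 weights come from the description. Because the weights are equal, the index is simply the share of questions answered correctly.

```r
# Hypothetical answers for three respondents: 1 = correct, 0 = incorrect
answers <- data.frame(
  I1 = c(1, 1, 0),
  I2 = c(1, 1, 0),
  I3 = c(1, 1, 1),
  I4 = c(1, 0, 0)
)

# Index as described: equal weights of 0.25 on each item
Knowledge_index <- with(answers, 0.25 * I1 + 0.25 * I2 + 0.25 * I3 + 0.25 * I4)
print(Knowledge_index)  # 1.00 0.75 0.25

# Equivalent views: mean of the items, or number correct out of 4
index_as_mean <- unname(rowMeans(answers))
n_correct     <- unname(rowSums(answers))
print(n_correct / 4)    # 1.00 0.75 0.25
```

Seen this way, each respondent contributes a count of successes out of 4 trials, which is the setting the binomial-type families are designed for.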

Thank you very much in advance!!

If Knowledge_index represents a proportion (for example, the share of questions answered correctly), the quasibinomial family with the logit link is appropriate.
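One way to see why this works: a proportion out of a fixed number of questions can also be modeled as counts of correct and incorrect answers. The sketch below uses simulated data, assuming 4 questions per respondent as described in the question; the data frame `sim`, its two predictors, and the effect sizes are hypothetical. It compares the proportion fit with the counts fit.

```r
set.seed(1)
n <- 200

# Hypothetical predictors standing in for the survey factors
sim <- data.frame(
  Gender    = factor(sample(c("F", "M"), n, replace = TRUE)),
  Education = factor(sample(c("primary", "secondary", "tertiary"), n, replace = TRUE))
)

# Assumed true probability of answering a question correctly
p_true <- plogis(-0.2 + 0.5 * (sim$Gender == "M") + 0.4 * (sim$Education == "tertiary"))
sim$correct         <- rbinom(n, size = 4, prob = p_true)
sim$Knowledge_index <- sim$correct / 4

# Fit 1: proportion as the response, as in the original model
fit_prop <- glm(Knowledge_index ~ Gender + Education, data = sim,
                family = quasibinomial(link = "logit"))

# Fit 2: (successes, failures) out of 4 questions
fit_counts <- glm(cbind(correct, 4 - correct) ~ Gender + Education, data = sim,
                  family = quasibinomial(link = "logit"))

se_prop    <- summary(fit_prop)$coefficients[, "Std. Error"]
se_counts  <- summary(fit_counts)$coefficients[, "Std. Error"]
phi_prop   <- summary(fit_prop)$dispersion
phi_counts <- summary(fit_counts)$dispersion

print(cbind(prop = coef(fit_prop), counts = coef(fit_counts)))
print(cbind(se_prop, se_counts))
print(c(phi_prop = phi_prop, phi_counts = phi_counts))
```

Both fits return the same coefficients and standard errors. Only the estimated dispersion differs, by a factor of 4 (the number of questions), because the quasibinomial family absorbs the constant number of trials into the dispersion estimate.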
