Logistic regression with non-independent observations

I would like to develop a binary classification model for inference and insight rather than accuracy. Logistic regression would be ideal: I really want the coefficients so I can understand the various predictors and the effect of incrementing one variable bit by bit while holding all the others constant.

However, my training observations are not independent of each other. According to this link:

Second, logistic regression requires the observations to be independent of each other. In other words, the observations should not come from repeated measurements or matched data.

The context is churn prediction, where a subscriber's renewal date passes each month. Someone who signed up in January and is still with us the following January will have 12 observations in the data, one for each time their renewal date passes.
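To make the data layout concrete, here is a minimal sketch of how one subscriber turns into several rows. The names `RenewalRow`, `expand_subscriber`, `cycle` and `churned` are made up for illustration and are not from my actual data.

```python
from dataclasses import dataclass

@dataclass
class RenewalRow:
    subscriber_id: str
    cycle: int     # 1 = first renewal date passed, 2 = second, ...
    churned: int   # 1 if the subscriber cancelled at this renewal, else 0

def expand_subscriber(subscriber_id, cycles_observed, churned_at_end):
    """Build one row per renewal date that has passed for a subscriber."""
    rows = []
    for cycle in range(1, cycles_observed + 1):
        is_last = cycle == cycles_observed
        # Churn can only be recorded on the last observed cycle.
        rows.append(RenewalRow(subscriber_id, cycle, int(is_last and churned_at_end)))
    return rows

# Subscriber "A" has passed 12 renewal dates and is still active;
# subscriber "B" cancelled at the third renewal.
rows = expand_subscriber("A", 12, False) + expand_subscriber("B", 3, True)
```

Subscriber "A" contributes 12 rows that all share the same person, which is exactly where the independence assumption breaks.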

Is there a logistic-regression-like model I can use that overcomes this breach of assumption and still provides coefficients for analysis? Or is there an alternative model that would allow for the same depth of understanding of the relationship between the predictors and the target variable?

You are describing survival analysis. I would start by looking at the Cox proportional hazards model.
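Survival models are usually fit on one row per subject, with a duration and an event flag, rather than one row per renewal. As a rough sketch of that reshaping, assuming hypothetical `(subscriber_id, cycle, churned)` tuples (the function name and layout are illustrative only, and this ignores predictors that change over time):

```python
from collections import defaultdict

def to_survival_rows(renewal_rows):
    """Collapse per-renewal rows to one (duration, event) pair per subscriber."""
    duration = defaultdict(int)
    event = defaultdict(int)
    for subscriber_id, cycle, churned in renewal_rows:
        # Duration is the last renewal cycle observed for the subscriber.
        duration[subscriber_id] = max(duration[subscriber_id], cycle)
        # Event is 1 if churn was ever recorded, 0 if still active (censored).
        event[subscriber_id] = max(event[subscriber_id], churned)
    return {sid: (duration[sid], event[sid]) for sid in duration}

renewal_rows = [("A", 1, 0), ("A", 2, 0), ("A", 3, 0), ("B", 1, 0), ("B", 2, 1)]
survival = to_survival_rows(renewal_rows)
```

Each subscriber then appears exactly once, and a subscriber who is still active is treated as censored instead of being repeated month after month.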

Thanks for the suggestion. I considered that too and might give it a try. I was also thinking of restricting the data to the first renewal cycle of a subscriber only, so they would only ever appear once.
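A minimal sketch of that restriction, assuming the same kind of hypothetical `(subscriber_id, cycle, churned)` tuples, where `cycle == 1` marks a subscriber's first renewal date:

```python
def first_cycle_only(renewal_rows):
    """Keep only each subscriber's first renewal cycle (cycle == 1)."""
    return [row for row in renewal_rows if row[1] == 1]

renewal_rows = [("A", 1, 0), ("A", 2, 0), ("B", 1, 1), ("C", 1, 0), ("C", 2, 1)]
first_only = first_cycle_only(renewal_rows)
```

Every subscriber appears at most once after this filter, but the coefficients would then only describe churn at the first renewal, since later cycles are discarded.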
