Hi All,
I have a logistic regression in R whose goal is to predict the probability of default on some test data:
glm(default ~ X1 + X2 + X3 + X4 + X5 + X1:term + term:X5 - 1, family="binomial", data=mydata)
What I'd like to do is 'bin' this data so that bins 1 to n each have a certain rate of default. How can I bin the logistic regression results in this way? For example, the bins on a sample set of 1000 might look like:
I think you can do this with the predict function. The help file for predict.glm says this about the type argument:
type: the type of prediction required. The default is on the scale of the linear predictors; the alternative "response" is on the scale of the response variable. Thus for a default binomial model the default predictions are of log-odds (probabilities on logit scale) and type = "response" gives the predicted probabilities.
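To make the two scales concrete, here is a small sketch on simulated data. The data frame DF and the columns X1, X2 and Result are placeholders I made up for illustration, not your variables. It checks that applying the inverse logit (plogis) to the default log-odds predictions gives the same numbers as type = "response".

```r
# Simulated data; DF, X1, X2 and Result are illustrative placeholders.
set.seed(1)
DF <- data.frame(X1 = rnorm(200), X2 = rnorm(200))
DF$Result <- rbinom(200, 1, plogis(-1 + 0.8 * DF$X1 - 0.5 * DF$X2))

FIT <- glm(Result ~ X1 + X2, family = "binomial", data = DF)

log_odds <- predict(FIT)                     # default: scale of the linear predictor
probs    <- predict(FIT, type = "response")  # predicted probabilities

# The inverse logit of the log-odds should reproduce the probabilities.
same_scale <- isTRUE(all.equal(unname(plogis(log_odds)), unname(probs)))
print(same_scale)
```

For binning by default rate you want the probabilities, so type = "response" is the convenient choice.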
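You should be able to use code like this to make bins. It bins the fitted probabilities for the data used to fit the model; to score separate test data, pass that data to predict through the newdata argument.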
FIT <- glm(Result ~ X1 + X2, family = "binomial", data = DF)
DF$prob <- predict(FIT, type = "response")
DF$bin <- cut(DF$prob, breaks = seq(0, 1, 0.1), include.lowest = TRUE)
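Since the goal is a default rate per bin, it may help to summarise each bin after cutting. The following is only a sketch on simulated data (DF, X1, X2 and Result are again placeholders). It shows fixed-width bins with the observed default rate in each, plus an alternative I am assuming might suit you: decile bins built with quantile, which hold roughly equal numbers of cases.

```r
# Simulated data; DF, X1, X2 and Result are illustrative placeholders.
set.seed(42)
DF <- data.frame(X1 = rnorm(1000), X2 = rnorm(1000))
DF$Result <- rbinom(1000, 1, plogis(-1 + DF$X1 - 0.5 * DF$X2))

FIT <- glm(Result ~ X1 + X2, family = "binomial", data = DF)
DF$prob <- predict(FIT, type = "response")

# Fixed-width bins; include.lowest = TRUE keeps a probability of exactly 0.
DF$bin <- cut(DF$prob, breaks = seq(0, 1, 0.1), include.lowest = TRUE)

# Per-bin count, observed default rate and mean predicted probability.
# Empty bins show a count of 0 and NA for the two rates.
bin_summary <- data.frame(
  n            = as.vector(table(DF$bin)),
  default_rate = as.vector(tapply(DF$Result, DF$bin, mean)),
  mean_prob    = as.vector(tapply(DF$prob, DF$bin, mean)),
  row.names    = levels(DF$bin)
)
print(bin_summary)

# Alternative: decile bins with roughly equal counts.
# cut() stops with an error if the quantile breaks are not unique (many tied probabilities).
breaks_q <- quantile(DF$prob, probs = seq(0, 1, 0.1))
DF$qbin <- cut(DF$prob, breaks = breaks_q, include.lowest = TRUE, labels = FALSE)
print(tapply(DF$Result, DF$qbin, mean))  # observed default rate per decile
```

If you need bins that hit specific target default rates, you could replace the breaks vector with your own cut points on the probability scale and check the observed rate in each bin the same way.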