Yes, I am familiar with logistic regression (e.g. odds ratio). What I don't understand is how I can define the high vs low predicted probability cutoff in my data. Any advice is much appreciated.
Likelihood ratio analysis is a way to compare two models, most commonly when one is nested within the other. For example, if model 1 has terms A and B and model 2 has only A, a likelihood ratio test (LRT) computes the likelihood of each model and compares them.
The likelihood can be thought of as a measure of how well the parameters fit the data. As an analogy, in simple linear regression the likelihood is basically the sum of squared errors (SSE). I'll use that as the example below; the likelihoods for other distributions are not as intuitive but work the same way.
For the two models, suppose the model 1 SSE is 10.0 and the model 2 SSE is 12.0. Since the fit is slightly better (lower SSE) when predictor B is included, you might conclude that model 1 is better. The question is how much better is good enough.
The math behind the LRT looks at the change in the likelihood relative to the difference in the number of model parameters (a.k.a. the degrees of freedom). If B is numeric, there is one fewer parameter in model 2, so it is a single degree of freedom test. With this, and the sample size, we can see whether a difference of 2.0 is significant (say, via a p-value).
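To make that concrete, the usual form of the statistic is twice the difference in maximized log-likelihoods:

$$\Lambda = 2\,(\log L_1 - \log L_2) \sim \chi^2_{\nu},$$

where $\nu$ is the number of extra parameters in model 1 (here $\nu = 1$). Large values of $\Lambda$ give small p-values and favor keeping B.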
That's a conceptual overview. For R code, I would suggest starting with Faraway's book.
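In the meantime, here is a minimal sketch of what the test looks like in R. The data are simulated, and the names (`x_a`, `x_b`, `model_1`, `model_2`) are made up for illustration:

```r
## Minimal sketch: likelihood ratio test for nested logistic regressions.
## The data are simulated; x_a, x_b, and the model names are hypothetical.
set.seed(1)
n   <- 200
x_a <- rnorm(n)
x_b <- rnorm(n)
y   <- rbinom(n, 1, plogis(0.5 * x_a))  # outcome depends on A only

model_1 <- glm(y ~ x_a + x_b, family = binomial)  # terms A and B
model_2 <- glm(y ~ x_a,       family = binomial)  # term A only (nested)

## anova() computes 2 * (logLik(model_1) - logLik(model_2)) and compares
## it to a chi-squared distribution with 1 df (one parameter dropped).
anova(model_2, model_1, test = "LRT")
```

A small p-value here says the drop in likelihood from removing B is bigger than chance alone would explain.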
Hi Max, do you think there are two types of likelihood ratios? I understand the type you described, but it seems different from the one used in the article referenced in my original question (for example, how does the "informative" category fit into your example?).
Do you know how the predicted probability cutoffs in the risk stratification table were selected in the paper I cited in my original post (see Table 4)? Is the selection based on clinical knowledge?