Modelling Post Probability Adjustments

Hi,

I have been looking at the latest updates in tidymodels around smoothing/calibrating the probabilities, and I have what might seem like a very basic question. Suppose I have a model that, on the training data, produces a precision of 60% and a recall of 90%. If I then put this model into production and it produces a score of 95% on an item, is this the true probability?

I'm not a huge probability expert, but I was wondering: when the model makes its prediction of 95%, does it factor in the precision/recall of the model from training?

If I know the prior probability of the disease/failure rate in the population, is it worth using Bayes' theorem on top of the probability output by the model to get true probabilities?

P(Disease|Test) = (P(Test|Disease) * P(Disease)) / P(Test)
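
For concreteness, here is that formula worked through with made-up numbers (a 90% sensitive, 95% specific test and a 2% prevalence; none of these figures come from my actual model):

```r
sens <- 0.90   # P(Test+ | Disease)
spec <- 0.95   # P(Test- | No disease)
prev <- 0.02   # P(Disease), the prior/prevalence

p_test_pos <- sens * prev + (1 - spec) * (1 - prev)  # P(Test+)
ppv <- (sens * prev) / p_test_pos                    # P(Disease | Test+)
ppv  # ~0.27, much lower than the 90% sensitivity
```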

Thank you for your time

First, when you use the tidymodels calibration code, it will recompute the hard class predictions. So if the probability of the event was 0.3 and calibration makes it 0.7, we change the hard class prediction from a nonevent to an event. This assumes a 50% cutoff.
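
A minimal sketch of what that looks like with the probably package, assuming `val_preds` is a data frame of held-out predictions with a factor `truth` column, probability columns `.pred_event`/`.pred_nonevent`, and a hard class column `.pred_class` (the data frame and column names here are placeholders):

```r
library(probably)

# Fit a logistic calibration model to the held-out probabilities
cal <- cal_estimate_logistic(val_preds, truth = truth)

# Rescale the probability columns; because pred_class is supplied,
# the hard class predictions are recomputed at the 50% cutoff
calibrated <- cal_apply(val_preds, cal, pred_class = .pred_class)
```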

Precision, recall, sensitivity, and specificity use the hard class predictions, so they don't really care what the probability value is, only which side of the cutoff it falls on.
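
To make that concrete (a sketch, using a hypothetical `preds` data frame with `truth` and a `.pred_event` probability column): predictions of 0.51 and 0.99 land in the same cell of the confusion matrix, so they contribute identically to these metrics.

```r
library(dplyr)
library(yardstick)

# Turn probabilities into hard classes at a 50% cutoff
hard <- preds %>%
  mutate(
    .pred_class = factor(
      if_else(.pred_event >= 0.5, "event", "nonevent"),
      levels = levels(truth)
    )
  )

# These only see .pred_class, never the underlying probability
sens(hard, truth = truth, estimate = .pred_class)
precision(hard, truth = truth, estimate = .pred_class)
```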

Those Bayesian performance estimates are the positive (and negative) predictive values.
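
For example, yardstick's ppv() and npv() take a prevalence argument so that a known population prior is used instead of the prevalence observed in the data (the `preds` data frame and the 2% prior below are placeholders):

```r
library(yardstick)

# With no prevalence argument, the prevalence observed in preds is used;
# supplying `prevalence` applies Bayes' rule with your own prior
ppv(preds, truth = truth, estimate = .pred_class, prevalence = 0.02)
npv(preds, truth = truth, estimate = .pred_class, prevalence = 0.02)
```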

If you want to make sure that the prior is used by the model to produce probabilities, then that is complicated because some models make no probability assumptions.

Some of the modeling functions allow you to specify the prevalence/prior as an argument. MASS::lda() is an example. The default is to use the training set data to estimate the prevalence (aka "empirical Bayes").
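
A quick sketch (hypothetical `train_data` with a factor `class` column; the prior must be in the same order as the factor levels and sum to 1):

```r
library(MASS)

# Override the training-set class proportions with a known prevalence
fit <- lda(class ~ ., data = train_data, prior = c(0.02, 0.98))

# Posterior class probabilities for new data reflect that prior
predict(fit, newdata = new_data)$posterior
```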


Thank you @Max as always for your help
