Explaining differences in predictions from massive regression model (lm)

Hello, the powers that be have decreed that my team has to work with a ridiculously large and complex regression model (1000+ predictors, interactions, etc.).

The model input (externally supplied data) changes depending on suggestions for how the model should change under future regulatory settings. My team needs to assess how these changes will affect us and why certain changes in the predictions happen.

There are many, many interactions, levers, and knobs that can influence the outcome, and even a tiny change can have big ripple effects downstream.

Before I go off and write my own suite to explain which parts of the model contribute most to changes in predictions between model iterations, I wondered if there is an existing package I could start looking into for something along these lines.


The DALEX and DALEXtra packages are a good place to start.
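A minimal sketch of the DALEX workflow for comparing two model iterations, using a toy `lm` on `mtcars` as a stand-in for the real regulatory model (the formulas and `label` names here are illustrative, not from the original post):

```r
library(DALEX)

# Two toy versions of a regression model, standing in for the
# current model and a proposed regulatory variant.
model_old <- lm(mpg ~ wt + hp, data = mtcars)
model_new <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Wrap each model in an explainer: data without the target, y = target.
exp_old <- explain(model_old, data = mtcars[, -1], y = mtcars$mpg,
                   label = "current", verbose = FALSE)
exp_new <- explain(model_new, data = mtcars[, -1], y = mtcars$mpg,
                   label = "proposed", verbose = FALSE)

# Break-down attributions for one observation: which predictors
# contribute most to that prediction under each model version.
bd_old <- predict_parts(exp_old, new_observation = mtcars[1, ])
bd_new <- predict_parts(exp_new, new_observation = mtcars[1, ])
plot(bd_old, bd_new)

# Permutation-based variable importance, comparable across versions.
plot(model_parts(exp_old), model_parts(exp_new))
```

Plotting the two break-down objects side by side makes it easy to see which terms drive the difference in a given prediction between iterations.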

I would suggest using regularization to help with that.


Thanks, the DALEX package really looks promising. Concerning regularization: the way the regression works is mandated by law, so we cannot change it.
