I was asked such a question recently by one of our partners. They apply a very traditional statistical-modelling approach: they build a logistic regression model, arrive at its final form via stepwise selection (which was shocking to me - who even does that?), make sure all p-values are significant, check univariate Ginis, and of course verify the linearity of the log-odds (along with a couple of other assumptions).

In response, I proposed that a different linear algorithm could be used - lasso, ridge, or elastic-net - to arrive at a more powerful model with good generalization properties. From everything I had read on the topic, applying such a shrinkage method means we do not need to worry as much about some of the points mentioned above, because the model's embedded feature selection takes care of that to a great extent.
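To make the proposal concrete, here is a minimal sketch of the elastic-net alternative in scikit-learn. The data is synthetic and all settings (`l1_ratio`, `C`) are illustrative assumptions, not tuned values:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the partner's data: 20 candidate predictors,
# only 5 of which are actually informative.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# l1_ratio blends lasso (1.0) and ridge (0.0); C is the inverse
# regularization strength. 'saga' is the solver that supports the
# elastic-net penalty in scikit-learn.
model = make_pipeline(
    StandardScaler(),  # shrinkage penalties are scale-sensitive
    LogisticRegression(penalty="elasticnet", solver="saga",
                       l1_ratio=0.5, C=0.1, max_iter=5000),
)
model.fit(X, y)

coefs = model.named_steps["logisticregression"].coef_.ravel()
print("non-zero coefficients:", int(np.sum(coefs != 0)), "of", coefs.size)
```

The l1 component of the penalty is what drives some coefficients exactly to zero - the "embedded feature selection" mentioned above.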

However, I was a bit taken by surprise when the partner asked me to compare the statistical properties of both models, e.g. by verifying the linearity of the log-odds for the shrinkage model. Would that actually even make sense? Today's goal-oriented predictive modelling aims at maximizing predictive performance - with a proper train/CV/test scheme, of course - and is not so bothered about meeting theoretical assumptions.

On the Stata support website there is a list of 10 reasons why stepwise regression is bad, compiled by Frank Harrell in 1996. Frank Harrell is well known in the R community and has contributed many packages, including Hmisc.

Finally, there is a question on StackExchange asking for the advantages of stepwise regression - every answer explains why stepwise is a bad idea.

I think the consensus is that stepwise regression is fast. So if you want to get the wrong results fast, use stepwise.

If you want a useful model that generalizes well, look at lasso regression instead.
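As an illustration of the lasso route, one can let cross-validation choose the penalty strength instead of hand-curating predictors by p-value the way stepwise does. This is a sketch on synthetic data, assuming the scikit-learn API:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler

# Synthetic data: 30 candidate predictors, 5 informative.
X, y = make_classification(n_samples=500, n_features=30, n_informative=5,
                           random_state=1)
X = StandardScaler().fit_transform(X)

# 5-fold CV over a grid of 10 candidate C values; penalty='l1' gives
# the lasso, and the CV score (AUC here) picks the strength for us.
lasso_cv = LogisticRegressionCV(Cs=10, cv=5, penalty="l1",
                                solver="liblinear", scoring="roc_auc")
lasso_cv.fit(X, y)

print("chosen C:", lasso_cv.C_[0])
print("variables kept:", int(np.sum(lasso_cv.coef_ != 0)), "of 30")
```

The variable selection here falls out of an out-of-sample criterion rather than a sequence of in-sample significance tests, which is the core of the generalization argument.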

Thanks @andrie, I completely agree that regularization methods should be the way to go - no question about it! The list of stepwise flaws is definitely very helpful for convincing the partner which method is superior.

However, I'm also quite sure the partner will be looking for very detailed documentation on validating the lasso/ridge model's properties/requirements (similar to what can be done for logistic regression, as mentioned in my original question). So let me rephrase my question:

Are there really any such properties/requirements to validate, apart from general model performance testing?

If there are any statistical properties/ requirements, what exactly should be tested?

Resampling would be your best bet here. It captures the randomness in which variables each of these methods selects, and you can use it to make probabilistic statements about a model's performance as well as about the differences between models.

One other thought about validation: none of these methods (especially stepwise) has a sound way of computing statistical significance for the predictors. Stepwise will give you all of the usual statistics, but they are incorrect because they ignore the uncertainty from all of the models that came before. There are Bayesian analogs that would produce a posterior distribution for each predictor, but apart from that, if you want statistical-significance statements, you won't get them from these methods.
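To show what such a Bayesian analog delivers, here is a deliberately minimal random-walk Metropolis sampler for a one-predictor Bayesian logistic regression; it produces a posterior distribution for the coefficient instead of a mis-stated stepwise p-value. Everything here (toy data, the N(0, 10^2) prior, the proposal scale) is an illustrative assumption; a real analysis would use PyMC or Stan:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
true_beta = 1.5
prob = 1 / (1 + np.exp(-true_beta * x))
y = rng.binomial(1, prob)

def log_post(beta):
    """Log posterior: Bernoulli log-likelihood + N(0, 10^2) prior on beta."""
    eta = beta * x
    loglik = np.sum(y * eta - np.log1p(np.exp(eta)))
    logprior = -0.5 * (beta / 10.0) ** 2
    return loglik + logprior

# Random-walk Metropolis: propose a nearby beta, accept with
# probability min(1, posterior ratio).
draws, beta = [], 0.0
for _ in range(5000):
    prop = beta + rng.normal(scale=0.3)
    if np.log(rng.uniform()) < log_post(prop) - log_post(beta):
        beta = prop
    draws.append(beta)

post = np.array(draws[1000:])  # drop burn-in
lo, hi = np.percentile(post, [2.5, 97.5])
print(f"posterior mean {post.mean():.2f}, "
      f"95% credible interval [{lo:.2f}, {hi:.2f}]")
```

The credible interval plays the role that a (valid) confidence interval would in the classical setting, and unlike stepwise output it reflects parameter uncertainty directly.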