Hi all,
I've got a quick question on the RMSE calculations in the Nested Resampling Tutorial.
The RMSE for the nested procedure is estimated as the average across the outer resamples applying the best tuning parameters selected using the respective inner resample. However, this means that different outer splits will be fit using different tuning parameters (as not all inner resamples chose the same parameter). The 'final' tuning parameter is then the one chosen most frequently in the inner resamples.
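To check that I'm describing the nested procedure correctly, here is a rough sketch of how I understand it, written in plain Python rather than the tutorial's R code. Everything in it is my own stand-in and not from the tutorial: a one-parameter ridge fit replaces the SVM, `penalty` plays the role of cost, simple k-fold splits replace the tutorial's resampling scheme, and the data are simulated.

```python
import random
from collections import Counter
from math import sqrt

def fit_slope(rows, penalty):
    # Toy stand-in for the SVM: ridge slope through the origin.
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, y in rows)
    return sxy / (sxx + penalty)

def rmse(rows, slope):
    return sqrt(sum((y - slope * x) ** 2 for x, y in rows) / len(rows))

def k_folds(rows, k):
    # Yield (analysis, assessment) pairs for a simple interleaved k-fold split.
    for i in range(k):
        assessment = rows[i::k]
        analysis = [r for j, r in enumerate(rows) if j % k != i]
        yield analysis, assessment

def tune(rows, grid, k):
    # Inner resampling: mean RMSE per candidate penalty, return the best one.
    scores = {}
    for penalty in grid:
        errs = [rmse(assess, fit_slope(an, penalty)) for an, assess in k_folds(rows, k)]
        scores[penalty] = sum(errs) / len(errs)
    return min(scores, key=scores.get)

def nested_rmse(rows, grid, outer_k=5, inner_k=3):
    chosen, outer_errs = [], []
    for analysis, assessment in k_folds(rows, outer_k):
        best = tune(analysis, grid, inner_k)   # tuned only on the outer analysis set
        chosen.append(best)
        outer_errs.append(rmse(assessment, fit_slope(analysis, best)))
    final = Counter(chosen).most_common(1)[0][0]  # most frequently chosen value
    return sum(outer_errs) / len(outer_errs), chosen, final

# Simulated data (hypothetical, for illustration only).
rng = random.Random(1)
rows = [(i / 10, 2 * (i / 10) + rng.gauss(0, 1)) for i in range(1, 61)]
grid = [0.0, 0.1, 1.0, 10.0]

nested_estimate, chosen, final = nested_rmse(rows, grid)
print(nested_estimate, chosen, final)
```

The point I'm trying to illustrate is that `chosen` can contain different values for different outer splits, so `nested_estimate` averages over fits that do not necessarily share one tuning parameter.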
On the other hand, the RMSE for the non-nested procedure is the smallest RMSE obtained when each candidate tuning parameter is evaluated on the outer resamples, and the best tuning parameter is the one that produced that RMSE.
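Again as a rough sketch of my understanding only, with the same stand-ins as assumptions (a toy ridge fit instead of the SVM, `penalty` instead of cost, simple k-fold splits, simulated data), the non-nested version would look something like this:

```python
import random
from math import sqrt

def fit_slope(rows, penalty):
    # Toy stand-in for the SVM: ridge slope through the origin.
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, y in rows)
    return sxy / (sxx + penalty)

def rmse(rows, slope):
    return sqrt(sum((y - slope * x) ** 2 for x, y in rows) / len(rows))

def k_folds(rows, k):
    # Yield (analysis, assessment) pairs for a simple interleaved k-fold split.
    for i in range(k):
        assessment = rows[i::k]
        analysis = [r for j, r in enumerate(rows) if j % k != i]
        yield analysis, assessment

# Simulated data (hypothetical, for illustration only).
rng = random.Random(1)
rows = [(i / 10, 2 * (i / 10) + rng.gauss(0, 1)) for i in range(1, 61)]
grid = [0.0, 0.1, 1.0, 10.0]

# Non-nested: every candidate is scored on the same outer resamples,
# and the minimum of those scores is reported as the RMSE.
scores = {}
for penalty in grid:
    errs = [rmse(assess, fit_slope(an, penalty)) for an, assess in k_folds(rows, 5)]
    scores[penalty] = sum(errs) / len(errs)

best_penalty = min(scores, key=scores.get)
non_nested_rmse = scores[best_penalty]
print(best_penalty, non_nested_rmse)
```

Here the reported `non_nested_rmse` belongs to one specific `best_penalty`, and the same assessment sets are used both to pick that value and to report its error.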
But don't the two (nested vs non-nested) RMSEs indicate the error of different models? The nested RMSE indicates the predictive error for applying an SVM, whereas the non-nested RMSE indicates the predictive error from applying an SVM with a specific cost value (2 in this case).
Are the two RMSEs really comparable? Should the nested RMSE be fitting a single best cost value (the one most frequently chosen in the inner resamples) to each of the outer resamples?
Apologies for the block of text and for any basic misunderstanding I might have here!