In particular, Figure 15 in Section 3.3 of the Supplementary Manual is said to be generated using an "out-of-sample (50 times 4:1 subsampling) approach."
Could someone post some code showing how to do this kind of "50 times 4:1" cross-validation on a toy data set?
The estimation methods and performance metrics are irrelevant. I would just love to see a process for replicating the graph and the "50 times 4:1" cross-validation.
This does not include cross-validation, but I think it shows the basics of making such a plot. The key is to have a column that identifies each point so that the lines can connect the same point across model types.
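library(ggplot2)

set.seed(123)

# Toy values: one baseline series, shifted to mimic four model types
Orig <- rnorm(100, mean = 8000, sd = 100)
DF <- data.frame(
  ID = rep(1:100, 4),  # identifies each point across models
  Model = rep(c("A", "B", "C", "D"), each = 100),
  Value = c(Orig, Orig - 500, Orig - 700, Orig - 600)
)

# Grey lines connect the same ID across models; boxplots are drawn on top
ggplot(DF, aes(x = Model, y = Value)) +
  geom_line(aes(group = ID), color = "grey80", alpha = 0.3) +
  geom_boxplot() +
  theme_classic()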
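To add the resampling part, here is a minimal sketch. It assumes "50 times 4:1 subsampling" means 50 independent random splits into 80% training and 20% test data (repeated hold-out, sometimes called Monte Carlo cross-validation); the manual may mean something different. The toy data, the three `lm` formulas, and the RMSE metric are all placeholders of my own, since the question says the methods and metrics do not matter. Each repetition gets an `ID`, which plays the same role as the `ID` column above, so the lines connect the models evaluated on the same split.

```r
library(ggplot2)

set.seed(123)

# Hypothetical toy data: one response and two predictors
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 - toy$x2 + rnorm(n)

# Placeholder models standing in for the real estimation methods
models <- list(
  A = y ~ 1,
  B = y ~ x1,
  C = y ~ x1 + x2
)

n_rep <- 50                # "50 times"
n_train <- round(0.8 * n)  # assumed meaning of 4:1: 80% train, 20% test

# One row per (repetition, model) with the out-of-sample RMSE
res <- do.call(rbind, lapply(seq_len(n_rep), function(i) {
  train_idx <- sample(n, n_train)
  train <- toy[train_idx, ]
  test <- toy[-train_idx, ]
  data.frame(
    ID = i,
    Model = names(models),
    Value = unname(vapply(models, function(f) {
      fit <- lm(f, data = train)
      sqrt(mean((test$y - predict(fit, newdata = test))^2))
    }, numeric(1)))
  )
}))

# Same plot as before: lines link the models scored on the same split
p <- ggplot(res, aes(x = Model, y = Value)) +
  geom_line(aes(group = ID), color = "grey80", alpha = 0.5) +
  geom_boxplot(fill = NA) +
  theme_classic() +
  ylab("Out-of-sample RMSE")
print(p)
```

Because every model is fitted and scored on the same 50 splits, the comparison is paired, which is what makes the connecting lines meaningful. To use your own methods, swap the `lm` call and the RMSE line for whatever estimator and metric you need.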