# Cross Validation Plot in R

I am relatively new to machine learning techniques such as cross-validation, and I am also quite new to R programming.

However, I am interested in replicating an out-of-sample technique used by Hothorn & Zeileis (2020).

https://www.tandfonline.com/doi/full/10.1080/10618600.2021.1872581

In particular, Figure 15 in Section 3.3 of the Supplementary Manual is said to be generated using an "out-of-sample (50 times 4:1 subsampling) approach."

Could someone post some code showing how to do this '50 times 4:1' type of cross-validation on a toy data set?

The estimation methods and performance metrics are irrelevant. I would just love to see a process for replicating the graph and the '50 times 4:1' cross-validation.

Any help would be genuinely appreciated.

This does not include cross-validation, but I think it shows the basics of making such a plot. The key is to have a column that identifies each point so that the lines can connect the same point across model types.

```r
library(ggplot2)
set.seed(123)
# Baseline values for 100 points; models B-D are shifted copies of model A
Orig <- rnorm(100, mean = 8000, sd = 100)
DF <- data.frame(ID = rep(1:100, 4),
                 Model = rep(c("A", "B", "C", "D"), each = 100),
                 Value = c(Orig, Orig - 500, Orig - 700, Orig - 600))
# Grey lines connect each ID across models; boxplots are drawn on top
ggplot(DF, aes(x = Model, y = Value)) +
  geom_line(aes(group = ID), color = "grey80", alpha = 0.3) +
  geom_boxplot() +
  theme_classic()
```

Created on 2022-02-13 by the reprex package (v2.0.1)
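For the subsampling part itself, here is a minimal sketch in base R. It assumes that "50 times 4:1 subsampling" means 50 independent random splits into 80% training and 20% test data, drawn without replacement; that is my reading of the phrase, not something confirmed against the paper's code. The `mtcars` data, the four `lm` formulas, and RMSE as the metric are arbitrary placeholders, since the question says the models and metrics are irrelevant.

```r
# Sketch: 50 repetitions of 4:1 subsampling (assumed to mean 80% train / 20% test)
set.seed(123)
n_rep   <- 50
n       <- nrow(mtcars)
n_train <- floor(0.8 * n)  # 4:1 split: 25 training rows, 7 test rows for mtcars

# Placeholder models, all predicting mpg (arbitrary choices for illustration)
forms <- list(
  A = mpg ~ 1,
  B = mpg ~ wt,
  C = mpg ~ wt + hp,
  D = mpg ~ wt + hp + qsec
)

# For each repetition: draw one split, fit every model on the training part,
# and score it on the held-out part. All models share the same split.
res <- do.call(rbind, lapply(seq_len(n_rep), function(r) {
  train_idx <- sample(n, n_train)
  train <- mtcars[train_idx, ]
  test  <- mtcars[-train_idx, ]
  rmse <- vapply(forms, function(f) {
    fit <- lm(f, data = train)
    sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
  }, numeric(1))
  data.frame(Rep = r, Model = names(forms), RMSE = unname(rmse))
}))

head(res)
```

The result `res` has one row per repetition and model, so it can go straight into the `ggplot` call above in place of `DF`, using `aes(x = Model, y = RMSE)` and `geom_line(aes(group = Rep))`. Each grey line then connects the scores that the four models obtained on the same split.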

Thank you @FJCC. This gives me a good idea of the kind of information that goes into the plot.
