Hello, I have the following data frame and would like to analyse it in a for loop.
Each iteration handles one gene and runs the analysis over all samples for that gene.
My issue is that for one of the iterations (genes) I would like to exclude one sample from the analysis.
Specifically, for gene Hbvrt I would like to exclude sample s4.
## create some data
sample_ID <- rep(c('s1','s2','s3','s4'),4)
gene_ID <- c( rep('TFT',4) , rep('Hbvrt' ,4), rep('Myx4',4), rep('Rai56n',4))
readz <- runif(16, 5000, 7500)
df <- data.frame(sample_ID , gene_ID , readz)
library(ggpubr)  # provides ggscatter()
## start the loop
res <- list()
for (g in unique(df$gene_ID)) {
  ## keep only the rows for the current gene
  df_g <- df[df$gene_ID == g, ]
  df_g$Nanost <- runif(4, 5000, 7500)
  df_g$NEW <- df_g$Nanost / df_g$readz * 100
  ## AND long code here ....
  ## function for graph ## Graphs Not shown here
  scatter_fun <- function(x, y) {
    ggscatter(df_g, x = "readz", y = "Nanost",
              add = "reg.line", conf.int = TRUE,
              cor.coef = TRUE, cor.method = "pearson",
              xlab = "readz", ylab = "Nanost")
  }
  ## append this gene's data frame to the result list
  res[[length(res) + 1]] <- df_g
}
print(res)
[[1]]
sample_ID gene_ID readz Nanost NEW
1 s1 TFT 6577.112 6582.497 100.08186
2 s2 TFT 6914.966 6192.676 89.55468
3 s3 TFT 7494.457 6508.501 86.84420
4 s4 TFT 7069.737 5966.418 84.39378
[[2]]
sample_ID gene_ID readz Nanost NEW
5 s1 Hbvrt 6346.545 7499.966 118.17399
6 s2 Hbvrt 7368.858 6860.801 93.10536
7 s3 Hbvrt 5671.581 5604.065 98.80957
8 s4 Hbvrt 6067.496 7420.354 122.29680 ## REMOVE THIS in the result and graph
[[3]]
sample_ID gene_ID readz Nanost NEW
9 s1 Myx4 5270.035 7086.622 134.47011
10 s2 Myx4 7338.199 5670.227 77.27002
11 s3 Myx4 5596.834 5595.212 99.97101
12 s4 Myx4 5477.589 7472.254 136.41502
[[4]]
sample_ID gene_ID readz Nanost NEW
13 s1 Rai56n 6526.715 6475.832 99.22040
14 s2 Rai56n 5512.179 5137.163 93.19660
15 s3 Rai56n 6109.446 5221.244 85.46182
16 s4 Rai56n 5836.242 5602.662 95.99776
The expected output is the same as above, except that list element [[2]] (gene Hbvrt) should not contain the row where sample_ID is s4, and that sample should also be left out of the graph for this gene. Would that be possible?
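To make the goal concrete, here is a minimal sketch of the kind of thing I have in mind. It is untested beyond this toy data. The `exclude` lookup list is a name I made up for illustration, and `set.seed()` is only there so the sketch is reproducible. It also assumes the hard-coded `4` in `runif()` can be replaced by `nrow(df_g)`, since the Hbvrt subset would then have only three rows.

```r
## toy data, same layout as in the question
set.seed(1)  # assumption: fixed seed only for reproducibility
sample_ID <- rep(c('s1', 's2', 's3', 's4'), 4)
gene_ID <- c(rep('TFT', 4), rep('Hbvrt', 4), rep('Myx4', 4), rep('Rai56n', 4))
readz <- runif(16, 5000, 7500)
df <- data.frame(sample_ID, gene_ID, readz)

## hypothetical lookup: gene -> sample(s) to drop for that gene only
exclude <- list(Hbvrt = "s4")

res <- list()
for (g in unique(df$gene_ID)) {
  df_g <- df[df$gene_ID == g, ]
  ## exclude[[g]] is NULL for genes not listed, and `%in% NULL` is all FALSE,
  ## so those genes keep every row
  df_g <- df_g[!(df_g$sample_ID %in% exclude[[g]]), ]
  ## use the actual row count instead of a hard-coded 4
  df_g$Nanost <- runif(nrow(df_g), 5000, 7500)
  df_g$NEW <- df_g$Nanost / df_g$readz * 100
  res[[length(res) + 1]] <- df_g
}
print(res)
```

Because the filtering happens before the rest of the loop body, any graph built from `df_g` inside the loop would also leave out s4 for Hbvrt. More genes or samples could be added to `exclude` without touching the loop.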
Many thanks.