For loop: How to apply a different command for a specific iteration

Sam8 · July 17, 2020, 7:57pm

Hello, I have the following data frame and I would like to analyse it in a for loop. The analysis goes through all samples in each iteration (one gene per iteration).
My issue is that for one of the iterations (genes) I would like to exclude one sample from the analysis.
Specifically, for gene Hbvrt I would like to exclude sample s4.

## create some data
sample_ID <- rep(c('s1','s2','s3','s4'),4)
gene_ID <- c( rep('TFT',4) , rep('Hbvrt' ,4), rep('Myx4',4), rep('Rai56n',4))
readz <- runif(16, 5000, 7500) 

df <- data.frame(sample_ID , gene_ID , readz)

## start the loop 
res <- list()
for ( g in unique(df$gene_ID)){
  
  df_g <- df[df$gene_ID == g, ]
  
  df_g$Nanost <- runif(4, 5000, 7500)
  
  df_g$NEW <- df_g$Nanost / df_g$readz * 100 

  ## AND long code here ....

  ## function for the graph (graphs not shown here)
  ## ggscatter() comes from the ggpubr package, so library(ggpubr) is needed before calling it
  scatter_fun = function(x, y) {
    
    ggscatter(df_g, x = "readz", y = "Nanost", 
              add = "reg.line", conf.int = TRUE, 
              cor.coef = TRUE, cor.method = "pearson",
              xlab = "readz", ylab = "Nanost")
    
  }

  res[[length(res) + 1]] <- df_g
}

print(res)

[[1]]
  sample_ID gene_ID    readz   Nanost       NEW
1        s1     TFT 6577.112 6582.497 100.08186
2        s2     TFT 6914.966 6192.676  89.55468
3        s3     TFT 7494.457 6508.501  86.84420
4        s4     TFT 7069.737 5966.418  84.39378

[[2]]
  sample_ID gene_ID    readz   Nanost       NEW
5        s1   Hbvrt 6346.545 7499.966 118.17399
6        s2   Hbvrt 7368.858 6860.801  93.10536
7        s3   Hbvrt 5671.581 5604.065  98.80957
8        s4   Hbvrt 6067.496 7420.354 122.29680 ## REMOVE THIS in the result and graph

[[3]]
   sample_ID gene_ID    readz   Nanost       NEW
9         s1    Myx4 5270.035 7086.622 134.47011
10        s2    Myx4 7338.199 5670.227  77.27002
11        s3    Myx4 5596.834 5595.212  99.97101
12        s4    Myx4 5477.589 7472.254 136.41502

[[4]]
   sample_ID gene_ID    readz   Nanost      NEW
13        s1  Rai56n 6526.715 6475.832 99.22040
14        s2  Rai56n 5512.179 5137.163 93.19660
15        s3  Rai56n 6109.446 5221.244 85.46182
16        s4  Rai56n 5836.242 5602.662 95.99776

The expected output is the same, except that list element [[2]] should not contain the row where sample_ID is s4. Would that be possible?

Many thanks.

woodward · July 17, 2020, 8:47pm

I think you would use an if statement. It doesn't matter if a list element stays empty.

for ( g in unique(df$gene_ID)){
  df_g <- df[df$gene_ID == g, ]
  if (!any(df_g$sample_ID == "s4")){
    # do code
    res[[length(res)+1]] <- df_g
  }
}

Sam8 · July 17, 2020, 9:11pm

Thanks a lot for your answer. That is OK for the iteration on Hbvrt, but where are the results for the other genes? Your code will give an empty list.
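
To see why the list comes back empty, here is a quick check on the toy data (a minimal sketch; `has_s4` is just an illustrative name). Every gene has a row for s4, so a `!any(...)` test on `sample_ID` is FALSE in every iteration and nothing is ever appended to `res`.

```r
## Rebuild the toy data (readz is not needed for this check)
sample_ID <- rep(c("s1", "s2", "s3", "s4"), 4)
gene_ID <- c(rep("TFT", 4), rep("Hbvrt", 4), rep("Myx4", 4), rep("Rai56n", 4))
df <- data.frame(sample_ID, gene_ID)

## For each gene: does its subset contain sample s4?
has_s4 <- sapply(unique(df$gene_ID), function(g) {
  any(df$sample_ID[df$gene_ID == g] == "s4")
})

print(has_s4)
all(has_s4)  # TRUE for this toy data, so the body of the if is skipped for every gene
```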

woodward · July 17, 2020, 9:27pm

Like this?

if (g != "Hbvrt") {
  df_g <- df[df$gene_ID == g, ]
} else {
  df_g <- df[df$gene_ID == g & df$sample_ID != "s4", ]
}
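
Put together with the loop from the question, this could look roughly like the sketch below. Two things are assumptions added here and not part of the original code: `set.seed(1)`, only so the random numbers are reproducible, and `nrow(df_g)` in place of the hard-coded `4`, because the Hbvrt subset now has 3 rows and assigning 4 values to it would fail. The plotting function is left out.

```r
## Toy data as in the question
set.seed(1)  # assumption: only for reproducibility
sample_ID <- rep(c("s1", "s2", "s3", "s4"), 4)
gene_ID <- c(rep("TFT", 4), rep("Hbvrt", 4), rep("Myx4", 4), rep("Rai56n", 4))
readz <- runif(16, 5000, 7500)
df <- data.frame(sample_ID, gene_ID, readz)

res <- list()
for (g in unique(df$gene_ID)) {
  if (g != "Hbvrt") {
    ## all samples for every other gene
    df_g <- df[df$gene_ID == g, ]
  } else {
    ## for Hbvrt only: drop sample s4
    df_g <- df[df$gene_ID == g & df$sample_ID != "s4", ]
  }

  ## nrow(df_g) instead of 4, so the 3-row Hbvrt subset also works
  df_g$Nanost <- runif(nrow(df_g), 5000, 7500)
  df_g$NEW <- df_g$Nanost / df_g$readz * 100

  res[[length(res) + 1]] <- df_g
}

sapply(res, nrow)  # 4 3 4 4
```

The second list element then has three rows (s1 to s3), and the other genes keep all four samples.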

Sam8 · July 17, 2020, 10:03pm

Thank you, that worked perfectly.
