Help optimize nested for loops used for subsetting


As part of a larger study, I'm simulating decisions about how to subset data. Currently I'm using nested for loops, as shown in the example below.

However, my full code runs over 1 million iterations, so I'm trying to optimize it as much as possible to reduce execution time.

I have optimized the code to the best of my knowledge; switching to data.table gave a small speed increase.

Some of the iterations will inevitably produce empty data frames. I tried using if/else/next to skip the current iteration when the data frame has nrow == 0, but that markedly increased the running time.
Is there any way I can optimize my code to decrease the running time?
Does it make any sense to parallelize it using foreach when the task for each iteration is so small?


library(data.table)
library(dplyr)    # n_distinct()
library(stringr)  # str_c()

my_df <- data.table(id = c("id1", "id1", "id1", "id2", "id2"),
           bin_year = c(1,1,1,2,2),
           outcome = c("outcome1", "outcome1", "outcome2", "outcome2", "outcome3"),
           bin_interv = c(1, 2, 3, 1, 2))

unq_outcome <- unique(my_df$outcome)

loop_output <- list()
for (l in 1:max(my_df$bin_year)) {
    for (o in 1:((max(my_df$bin_interv)) + 3)) {
      for (p in 1:((n_distinct(unq_outcome)) + 1)) {
        # iterations
        iteration <- str_c(l, o, p, sep = "_")  # separator avoids key collisions once an index exceeds 9
        # selectors
        select_year <- 1:l
        select_interv <- if (o <= max(my_df$bin_interv)) {o} else 
                         if (o == max(my_df$bin_interv) + 1 ) {c(2,4)} else 
                         if (o == max(my_df$bin_interv) + 2 ) {c(1,5)} else {1:max(my_df$bin_interv)}
        select_outcome <- if (p <= n_distinct(unq_outcome)) {unq_outcome[p]} else {unq_outcome}
        # subset data
        loop_output[[iteration]] <- my_df[bin_year %in% select_year & 
                                          bin_interv %in% select_interv & 
                                          outcome %in% select_outcome]
      }
    }
}
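
One direction that might reduce the per-iteration cost, sketched for the toy data above and not benchmarked on the full problem: compute each selector's logical mask once, outside the loops, so that every iteration only combines precomputed vectors instead of re-running max(), n_distinct() and %in%. The names interv_sets, outcome_sets, year_masks, interv_masks, outcome_masks, results and keys are illustrative, and the "_"-separated keys are an assumption rather than part of the original code.

```r
library(data.table)

my_df <- data.table(
  id         = c("id1", "id1", "id1", "id2", "id2"),
  bin_year   = c(1, 1, 1, 2, 2),
  outcome    = c("outcome1", "outcome1", "outcome2", "outcome2", "outcome3"),
  bin_interv = c(1, 2, 3, 1, 2)
)

# Loop invariants, computed once instead of in every iteration
max_year    <- max(my_df$bin_year)
max_interv  <- max(my_df$bin_interv)
unq_outcome <- unique(my_df$outcome)

# Selector sets: each single value first, then the combined sets
interv_sets  <- c(as.list(seq_len(max_interv)),
                  list(c(2, 4), c(1, 5), seq_len(max_interv)))
outcome_sets <- c(as.list(unq_outcome), list(unq_outcome))

# One logical mask per selector, each as long as nrow(my_df)
year_masks    <- lapply(seq_len(max_year), function(l) my_df$bin_year %in% seq_len(l))
interv_masks  <- lapply(interv_sets,  function(s) my_df$bin_interv %in% s)
outcome_masks <- lapply(outcome_sets, function(s) my_df$outcome %in% s)

n_l <- length(year_masks)
n_o <- length(interv_masks)
n_p <- length(outcome_masks)

# Preallocate the output instead of growing a list by name
results <- vector("list", n_l * n_o * n_p)
keys    <- character(length(results))

k <- 0L
for (l in seq_len(n_l)) {
  for (o in seq_len(n_o)) {
    yo <- year_masks[[l]] & interv_masks[[o]]  # reused for every p
    for (p in seq_len(n_p)) {
      k <- k + 1L
      keys[k] <- paste(l, o, p, sep = "_")
      results[[k]] <- my_df[yo & outcome_masks[[p]]]
    }
  }
}
names(results) <- keys
```

Because the combined mask is already available inside the innermost loop, any() on it would give a cheap test for an empty subset before any data.table is built. Whether this beats the original loop on the full data would need to be measured.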
