Parallelizing nested loops

I am trying to speed up the code below, but I am not sure whether to parallelize only the outermost loop or the outer and inner loops together. I am working on Ubuntu with 2 processors, and I do not know how many threads each processor would create for this task, or whether spawning many threads would bring complications that I should be aware of and control with locks. What would you recommend? Thank you.


nc = detectCores()
cl = makeCluster(nc, type = "FORK")

pts <- list(chunkSize=2)

    foreach(H = 0:HexC, .combine = "c") %:%
        foreach(HN = 0:HNcC, .combine = "c") %dopar% {
            foreach(F = 0:FucC, .combine = "c") %dopar% {
                foreach(SA = 0:SAC, .combine = "c") %dopar%
                    for (SO3 in 0:SO3C) {
                        NAmax <- sum(SA + SO3)
                        foreach(NAD = 0:NAmax, .combine = "c") %dopar% {
                            Na_Cnt<- c(Na_Cnt, NAD)
                            SO3_Cnt<- c(SO3_Cnt, SO3)
                            SA_Cnt<- c(SA_Cnt, SA)
                            Fuc_Cnt<- c(Fuc_Cnt, F)
                            HexNAc_Cnt<- c(HexNAc_Cnt, HN)
                            Hex_Cnt<- c(Hex_Cnt, H)

                            Na_Mass<- c(Na_Mass, NAD*NaAdductMass)
                            SO3_Mass<- c(SO3_Mass, SO3*dels["SO3"])
                            SA_Mass<- c(SA_Mass, SA*dels["SA"])
                            Fuc_Mass<- c(Fuc_Mass, F*dels["Fuc"])
                            HexNAc_Mass<- c(HexNAc_Mass, HN*dels["HexNAc"])
                            Hex_Mass<- c(Hex_Mass, H*dels["Hex"])

If you only have 2 processors and this is running locally, it doesn't make a lot of sense to run in parallel. You could never get your run time below 50% of your single-threaded run time. And that's a theoretical best case assuming no parallelization overhead, which always exists.
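To put a rough number on that ceiling, here is a small sketch using Amdahl's law. That law isn't something from your post, it's just the usual way to express the bound, and the parallel fractions `p` below are made up for illustration.

```r
# Amdahl's law: best-case speedup when a fraction p of the work
# can be spread over n workers (ignores all parallel overhead).
amdahl_speedup <- function(p, n) 1 / ((1 - p) + p / n)

n_workers <- 2            # the two processors from the question
p <- c(0.5, 0.9, 1.0)     # hypothetical parallelizable fractions

speedup <- amdahl_speedup(p, n_workers)
data.frame(
  parallel_fraction = p,
  speedup           = round(speedup, 2),
  runtime_vs_serial = round(1 / speedup, 2)
)
```

Only the row with `p = 1` reaches the 50% figure, and any real overhead pushes the run time back up from there.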

In addition, what makes your run time slow in this loop? From your code I can't know how big any of these objects are or what steps take time. But it looks like this loop is lots and lots of combining and subsetting.
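One way to find out is to time a cut-down version of the loop at a few sizes with `system.time()`. The sketch below is only a stand-in: `run_loop()` and the limit `k` are hypothetical, and the body just grows a single vector the way your loop grows many.

```r
# Hypothetical stand-in for the loop in the question: five nested loops
# over 0:k, growing one vector with c() on every iteration.
run_loop <- function(k) {
  out <- c()
  for (H in 0:k) for (HN in 0:k) for (Fu in 0:k) for (SA in 0:k) for (SO3 in 0:k) {
    out <- c(out, H)
  }
  length(out)
}

# Time the same code at a few sizes to see how quickly the cost grows.
sizes <- c(2, 4, 6)
elapsed <- sapply(sizes, function(k) system.time(run_loop(k))[["elapsed"]])
data.frame(k = sizes, iterations = (sizes + 1)^5, elapsed_sec = elapsed)
```

If the elapsed time climbs faster than the iteration count, the cost is probably in the combining itself rather than in any arithmetic.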

If there were gains to be made from parallelization here (I'm not sure there are), you would get them by only parallelizing the loop that takes time. Hardly anything happens in your loops until the innermost loop (with the exception of NAmax<- sum(SA+SO3)). So there's no benefit to spewing these out in parallel. Every time you fork to make the process parallel you have to deal with some overhead. You don't want to deal with that overhead unless you are going to get meaningful gains from it.
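As a rough sketch of what "one parallel level around the expensive work" could look like, here is a version using `mclapply()` from the base `parallel` package instead of `foreach`. `slow_step()` and the range of `H_values` are hypothetical; they stand in for whatever actually dominates your run time.

```r
library(parallel)  # ships with base R; mclapply() forks on Linux/macOS

# Hypothetical expensive step; stands in for the part that takes time.
slow_step <- function(h) {
  s <- 0
  for (i in seq_len(2e5)) s <- s + sqrt(i + h)
  s
}

H_values <- 0:7   # illustrative range, not taken from the question

# One level of parallelism: split the index across 2 workers and
# keep everything inside each task serial.
res_par <- mclapply(H_values, slow_step, mc.cores = 2)
res_seq <- lapply(H_values, slow_step)

# Both versions should produce the same values.
all.equal(unlist(res_par), unlist(res_seq))
```

Wrapping both calls in `system.time()` would show whether the fork overhead is worth it for your real workload.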


If I understand correctly, the top part of your loop just makes vectors for every possible combination of some variables. The expand.grid() function is perfect for this:

my_data <- expand.grid(
  H   = 0:HexC,
  HN  = 0:HNcC,
  `F` = 0:FucC,
  SA  = 0:SAC,
  SO3 = 0:SO3C
)

I don't have the data for the bottom half of the inner loop, but I'd bet it could be expressed as adding and manipulating columns of the above data.frame.
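Here is a minimal sketch of how that might look for the part of the loop you posted. The limits, `NaAdductMass`, and the values in `dels` are placeholders I made up so the code runs; swap in your own. Note that the row order will differ from your loop, because `expand.grid()` varies the first variable fastest.

```r
# Placeholder limits and masses, only so the sketch runs.
HexC <- 2; HNcC <- 2; FucC <- 1; SAC <- 1; SO3C <- 1
NaAdductMass <- 21.98
dels <- c(Hex = 162.05, HexNAc = 203.08, Fuc = 146.06, SA = 291.10, SO3 = 79.96)

my_data <- expand.grid(H = 0:HexC, HN = 0:HNcC, `F` = 0:FucC,
                       SA = 0:SAC, SO3 = 0:SO3C)

# One row per NAD in 0:NAmax, where NAmax = SA + SO3 for that row.
n_rep <- my_data$SA + my_data$SO3 + 1
my_data <- my_data[rep(seq_len(nrow(my_data)), times = n_rep), ]
my_data$NAD <- sequence(n_rep) - 1
rownames(my_data) <- NULL

# Mass columns, each computed for the whole column at once.
my_data$Na_Mass     <- my_data$NAD * NaAdductMass
my_data$SO3_Mass    <- my_data$SO3 * dels[["SO3"]]
my_data$SA_Mass     <- my_data$SA  * dels[["SA"]]
my_data$Fuc_Mass    <- my_data$F   * dels[["Fuc"]]
my_data$HexNAc_Mass <- my_data$HN  * dels[["HexNAc"]]
my_data$Hex_Mass    <- my_data$H   * dels[["Hex"]]

head(my_data)
```

Each `*_Cnt` vector from your loop corresponds to one of the count columns here, and each `*_Mass` vector to one of the mass columns.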

Edit: I don't mean this to be "You're not following the sacred idioms!" It should help speed up your code, especially if the number of combinations is very large. "Growing" vectors is a very time-expensive approach in R; every time you do x <- c(x, y), R is actually creating a new object with the value c(x, y) and then having the name x point to that new object. This takes a small amount of time to do but can quickly add up to a long time in a loop.
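If you want to see the difference yourself, a quick comparison along these lines should show it. The size `n` is arbitrary, and `grow()` and `prealloc()` are just illustrative helper names.

```r
n <- 2e4   # illustrative size

# Growing: each c(x, i) allocates a new, longer vector and copies x into it.
grow <- function(n) {
  x <- c()
  for (i in seq_len(n)) x <- c(x, i)
  x
}

# Preallocating: the vector is created once and then filled by index.
prealloc <- function(n) {
  x <- integer(n)
  for (i in seq_len(n)) x[i] <- i
  x
}

system.time(a <- grow(n))
system.time(b <- prealloc(n))

# Same result either way; only the cost of getting there differs.
identical(a, b)
```

The gap widens as `n` grows, since the growing version copies an ever longer vector on every iteration.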

But expand.grid() avoids creating so many objects by doing a little work up front. It figures out how many rows and columns the final result should have, then repeats each vector's values as needed to have a row for every combination.
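A tiny example makes the repetition pattern visible. The vectors `a` and `b` are made up for illustration.

```r
a <- 0:1
b <- 0:2

grid <- expand.grid(a = a, b = b)
print(grid)

# The number of rows is the product of the input lengths.
nrow(grid) == length(a) * length(b)

# The first variable cycles fastest and the last one slowest,
# which matches these two rep() calls.
identical(grid$a, rep(a, times = length(b)))
identical(grid$b, rep(b, each = length(a)))
```

So the whole table is built from a handful of `rep()`-style operations rather than one object per combination.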


Thank you very much. Awesome