Running permutations for large datasets

Hello,
I have large datasets stored as lower-triangular matrices. I need to run 1000 permutations on each, and I am using the parallel package with all available cores to speed this up. The problem is that running 1000 permutations requires a lot of computing power and uses up all my memory. I increased the memory to 128 GB of RAM and it still takes too long. I am using the mgram function from the ecodist package to produce correlograms (it gives me a Mantel r output) after the files are permuted.

Has anyone encountered or solved this problem? How can I run the permutations more efficiently without increasing memory use? I have already tried 48 cores, and that just makes it worse. I also tried permuting, saving the result to a file, and then deleting it from the environment, but that was difficult for me: it meant converting each matrix into a data frame, writing it to .csv for 500 files, and then reading them back in to run the mgram correlogram function.

library(ecodist)

# Compute the Mantel correlogram table for the t-th species matrix.
mf <- function(t) {
  df <- mgram(SpLowerMatrices[[t]], CoordsLowerMatrix,
              nperm = 1000, nboot = 0, nclass = 50)$mgram
  df
}

Thanks!
-Soraida
PhD Student, UIC

I find your post somewhat hard to follow. It seems that you have wrapped some parallelization code around the mgram call you shared, but you didn't share that parallelization code for us to look at?

If you benchmark one of your mgram calls, how long does it take? (That is, establish whether it is worth the extra effort to run it in parallel.) If you benchmark with something like the bench package, it should also give some indication of how much memory the task consumes, which can be useful as well.
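
A minimal sketch of such a benchmark, assuming the ecodist and bench packages are installed. The small random matrices are hypothetical stand-ins for one element of SpLowerMatrices and for CoordsLowerMatrix, and nperm and nclass are reduced so the call finishes quickly; none of these values come from the original post.

```r
library(ecodist)
library(bench)

set.seed(1)

# Hypothetical stand-in data: 60 sites, 5 species, 2-D coordinates.
n_sites <- 60
species <- matrix(runif(n_sites * 5), nrow = n_sites)
coords  <- matrix(runif(n_sites * 2), nrow = n_sites)

sp_dist    <- dist(species)  # stands in for SpLowerMatrices[[t]]
coord_dist <- dist(coords)   # stands in for CoordsLowerMatrix

# Time a single mgram call and record how much memory it allocates.
# check = FALSE skips result comparison, which is not needed here.
timing <- bench::mark(
  mgram(sp_dist, coord_dist, nperm = 100, nboot = 0, nclass = 10),
  iterations = 1,
  check = FALSE
)

# median is the run time, mem_alloc the memory allocated by the call.
print(timing[, c("median", "mem_alloc")])
```

Running this once on a real matrix, with the real nperm, gives a per-call time and memory figure. Multiplying the memory figure by the number of workers gives a rough idea of whether a parallel run can fit in RAM.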

Hello, thank you for replying, and sorry for not replying sooner. I am new here.
Yes, one file with 100 permutations takes about 1.5 hours. I use the parallel package in R and have 48 cores. When I run 10 files on 12 cores at 100 permutations, it takes about 2.5 hours. I can't go beyond that because too much memory is used and the R environment fills up.
I tried to delete objects as they are iterated over using gc(), but it just deletes all the files. I have not tried rm() in my loop. I will keep trying.
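
One possible pattern for keeping memory down, sketched under assumptions rather than tested on the real data: each task writes its correlogram to disk with saveRDS() and returns only the file path, and the number of workers is kept well below the number of cores, since each worker needs memory for its own task. Note that gc() only triggers garbage collection for objects that are no longer referenced; it is rm() that removes an object from the environment. The data, the out_dir location, the run_one helper, and the worker count below are all hypothetical.

```r
library(ecodist)
library(parallel)

set.seed(1)

# Hypothetical stand-in data; replace with the real inputs.
n_sites <- 40
coords_dist <- dist(matrix(runif(n_sites * 2), nrow = n_sites))
sp_dists <- lapply(1:4, function(i) {
  dist(matrix(runif(n_sites * 5), nrow = n_sites))
})

out_dir <- file.path(tempdir(), "mgram_results")
dir.create(out_dir, showWarnings = FALSE)

# Run one correlogram, save it to disk, and return only the file path,
# so the parent session never holds all results at once.
# res is local, so it becomes collectable as soon as run_one() returns.
run_one <- function(t) {
  res <- mgram(sp_dists[[t]], coords_dist,
               nperm = 100, nboot = 0, nclass = 10)$mgram
  out_file <- file.path(out_dir, sprintf("mgram_%03d.rds", t))
  saveRDS(res, out_file)  # keeps the R object as-is, no .csv conversion
  out_file
}

# Use fewer workers than cores to limit peak memory.
# mclapply() forks, so mc.cores > 1 is not supported on Windows.
files <- mclapply(seq_along(sp_dists), run_one, mc.cores = 2)

# Read back a single result only when it is needed ...
first <- readRDS(files[[1]])
print(dim(first))

# ... and drop it explicitly when done: rm() removes the object,
# gc() then lets R reclaim the memory.
rm(first)
gc()
```

saveRDS() and readRDS() store the object in R's own format, so there is no need to convert each matrix to a data frame and back.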
