I have approximately 1000 16S samples sequenced on the PacBio platform, and I am using the DADA2 workflow to infer ASVs. The script stalls at the dada() step, since I use the pooling option:
dd <- dada(filts, err=err, pool=TRUE, BAND_SIZE=32, OMEGA_A=1e-10, DETECT_SINGLETONS=FALSE, multithread=TRUE)
I tried different servers with more memory, but that did not help. So I concluded that the issue comes from RStudio's memory limit. I tested two options to increase the available memory:
1- use the doSNOW package
library(doSNOW)
# number of cores to be used
cl <- makeCluster(5)
# make cluster of cores available
registerDoSNOW(cl)
2- use the unix package
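For illustration, raising the memory limit with the unix package usually looks something like the sketch below. It assumes the package's rlimit_as() function, which reads or sets the address-space limit of the R process. The 64 GB target is a hypothetical value, not a setting taken from this post, so the line that changes the limit is left commented out.

```r
# Assumption: the 'unix' package is installed (it works on Linux/macOS only).
if (requireNamespace("unix", quietly = TRUE)) {
  # Query the current address-space limit without changing anything.
  # The result is a list with 'cur' (soft limit) and 'max' (hard limit).
  lim <- unix::rlimit_as()
  print(lim)

  # Raising the soft limit would look like this.
  # 'target_bytes' is a hypothetical value chosen for illustration.
  # target_bytes <- 64 * 1024^3
  # unix::rlimit_as(cur = target_bytes)
}
```

The soft limit cannot be raised above the hard limit by an unprivileged process, so checking the values first shows whether a limit is in place at all.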
Neither of these worked, and the script still stalls at the dada step.
I am using RStudio on a machine running Ubuntu 20.04.
The full set of samples is 50 GB; however, my trial used a 30 GB subset.
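The sizes above are on-disk sizes, which can be checked from R with a few lines like these. This is a sketch: total_size_gb is a hypothetical helper, and filts is assumed to be the character vector of file paths passed to dada() above.

```r
# Hypothetical helper: total on-disk size of a set of files, in GiB.
total_size_gb <- function(paths) {
  # file.size() returns NA for missing files, so those are skipped.
  sum(file.size(paths), na.rm = TRUE) / 1024^3
}

# Usage, assuming 'filts' holds the filtered FASTQ paths given to dada():
# total_size_gb(filts)
```

If the files are compressed, the on-disk size is only a rough proxy for what has to be held in memory.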
Your help is appreciated!