I am trying to distribute my MCS simulation across the nodes of my university's computer cluster, using r-tensorflow and snowfall.
When I distribute across the cores of a single node with the following code:
sfInit(parallel = TRUE, cpus = 15)
everything works fine.
But if I try to distribute across multiple nodes with the following code:
pbsnodefile <- Sys.getenv("PBS_NODEFILE")
machines <- scan(pbsnodefile, what = "")
nmach <- length(machines)
sfInit(parallel = TRUE, type = "SOCK", cpus = nmach, socketHosts = machines)
snowfall fails to load the keras package (and tensorflow as well) on the workers, although any other "normal" package loads fine. In this case I receive the following error:
Error in sfLibrary(keras) :
Stop: error loading library on slave(s): keras
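The message above does not say why the load fails on the workers. A minimal sketch for surfacing the underlying error on each node follows. It assumes the same PBS setup as above, and `try_load` is a hypothetical helper written for illustration, not part of snowfall:

```r
# Hypothetical helper (not part of snowfall): try to attach a package and
# return either "ok" or the underlying error message, prefixed with the host name.
try_load <- function(pkg) {
  status <- tryCatch({
    suppressPackageStartupMessages(library(pkg, character.only = TRUE))
    "ok"
  }, error = function(e) conditionMessage(e))
  paste0(Sys.info()[["nodename"]], ": ", status)
}

# Only run the cluster part inside a PBS job where snowfall is installed.
pbsnodefile <- Sys.getenv("PBS_NODEFILE")
if (nzchar(pbsnodefile) && requireNamespace("snowfall", quietly = TRUE)) {
  library(snowfall)
  machines <- scan(pbsnodefile, what = "")
  sfInit(parallel = TRUE, type = "SOCK",
         cpus = length(machines), socketHosts = machines)
  # Run the helper on every worker; each list element is one worker's report.
  print(sfClusterCall(try_load, "keras"))
  sfStop()
}
```

Each worker should then report either "ok" or the actual error raised by library(keras) on that host, which may help tell whether the remote R sessions see a different library path or environment than the master.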
Thank you for your help!!
Best
Tullio