Spatial kernel parallelization on a virtual machine: is it actually running in parallel? Error: "double free or corruption"

Hi,
I am trying to parallelize a spatial kernel on a virtual machine (Windows) with 160 cores, of which I am trying to use 6.
It runs extremely slowly, and I am not sure whether the parallelization is working.
Unfortunately, I was not able to add a progress bar or to check that the number of cores in use is actually 6.
If I run the top command in the terminal, I can see that CPU usage is 100%, but I would like a more reliable check.

The code itself is fine: I tried it without the kernel function (hotspot_kde) and it returns the expected list of sf objects. However, I am not sure how to verify that it is really running in parallel.
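A minimal sketch of one possible check, using only the base parallel package (which provides the makeCluster function used in the code further down): ask every worker for its process ID, then time a task that sleeps on each worker. The 2-worker cluster and the names cl, worker_pids and elapsed are illustrative assumptions, not part of the code in this post. With a foreach backend registered, foreach::getDoParWorkers() also reports the number of registered workers.

```r
library(parallel)

# Illustrative 2-worker cluster (the code further down uses 6)
cl <- makeCluster(2)

# clusterCall() runs the function once on every worker,
# so this returns one process ID per worker
worker_pids <- unlist(clusterCall(cl, Sys.getpid))

# Each worker sleeps for 1 second; if the workers run at the same time,
# the elapsed time stays close to 1 second instead of 2
elapsed <- system.time(clusterCall(cl, Sys.sleep, 1))[["elapsed"]]

stopCluster(cl)

print(worker_pids)                  # worker PIDs, different from Sys.getpid()
print(length(unique(worker_pids)))  # number of distinct worker processes
print(elapsed)
```

The printed process IDs can then be looked up in top (or the Task Manager) while the real job is running, to see whether each worker is actually busy.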

The long run time is due to the kernel function, and I really need to speed this code up.
Could you help me understand what I am doing wrong? How can I properly speed it up?

Because it was taking too long, I interrupted the foreach call that builds KernelList and then called stopCluster, which returned the message "double free or corruption". This makes me think that something is wrong in the parallel processing.
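A hedged sketch of one way to make sure the cluster is always shut down, even when the job fails or is interrupted. It does not explain the "double free or corruption" message; it only shows cleanup through on.exit(). The helper with_cluster and the toy tasks are hypothetical names introduced for illustration, and the sketch uses the base parallel package rather than foreach.

```r
library(parallel)

# Hypothetical helper: runs `task(cl)` on a fresh cluster and always
# stops the cluster afterwards, even if `task` fails or is interrupted
with_cluster <- function(n_workers, task) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)
  task(cl)
}

# Normal run: square four numbers on 2 workers
squares <- with_cluster(2, function(cl) {
  parLapply(cl, 1:4, function(x) x^2)
})

# Failing run: the error is caught here, and on.exit() has already
# stopped the cluster by the time the handler runs
failed <- tryCatch(
  with_cluster(2, function(cl) stop("simulated failure")),
  error = function(e) conditionMessage(e)
)

print(unlist(squares))
print(failed)
```

Wrapping the work this way avoids calling stopCluster by hand after an interrupted run, which is the situation described above.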

I also tried re-running the code after dividing the dataset into small pieces and running the calculation on each piece, in order to improve memory allocation. It was fine when I ran it over the weekend and kept working until yesterday morning, but then I hit the "double free or corruption (out) aborted" error again.
Thank you for your help!!
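For reference, a minimal base R sketch of what "dividing the dataset into small pieces" can look like. The data frame df_demo, the value of chunk_size and the per-piece calculation (a simple row count standing in for the kernel) are all hypothetical. Whether per-piece results can be meaningfully recombined depends on the calculation.

```r
# Hypothetical stand-in for df_outbreaks: 10 points with x/y coordinates
df_demo <- data.frame(x = 1:10, y = 11:20)

# Rows per piece (illustrative value)
chunk_size <- 4

# Assign each row to a piece: 1,1,1,1,2,2,2,2,3,3
chunk_id <- ceiling(seq_len(nrow(df_demo)) / chunk_size)
chunks <- split(df_demo, chunk_id)

# Placeholder calculation per piece (NOT a kernel density):
# count the rows in each piece
chunk_rows <- vapply(chunks, nrow, integer(1))

print(chunk_rows)
```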

library("ggplot2")    # Plotting
library("sf")         # Spatial data
library("sfhotspot")  # Kernel density (hotspot_kde)
library("doParallel") # Parallel backend
library("foreach")    # Parallel loops

# Kernel density ----

# Optimal bandwidth
opt_bw <- 2850

# Split by continent

df_split <- df_outbreaks |>
  dplyr::group_split(continent) |>
  setNames(sort(unique(df_outbreaks$continent)))

# Parallel processing ----

# Detect the number of cores

detectCores()

# Create the cluster and register it as the foreach backend

cluster <- makeCluster(6)
registerDoParallel(cluster)

# Compute one kernel density map per continent in parallel

KernelList <- foreach(i = seq_along(df_split),
                      .packages = c("sf", "sfhotspot")) %dopar% {

  # Convert each data frame to an sf object
  # (named pts to avoid reusing the package name sf)
  pts <- st_as_sf(df_split[[i]], coords = c("x", "y"),
                  crs = "+proj=moll +lon_0=0 +x_0=0 +y_0=0")

  # Create a kernel density map for this sf object
  hotspot_kde(pts, cell_size = 1000, bandwidth = opt_bw)
}

# Stop the cluster

stopCluster(cluster)
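
A rough back-of-the-envelope sketch of how many grid cells a setting like cell_size = 1000 could imply. It assumes that cell_size is interpreted in the units of the CRS and that those units are metres for this Mollweide definition; both are assumptions rather than something verified here. The bounding-box dimensions are hypothetical, not taken from df_outbreaks.

```r
# Hypothetical bounding box of one continent in projected metres
# (illustrative numbers, not taken from df_outbreaks)
width_m  <- 8e6   # 8,000 km
height_m <- 8e6   # 8,000 km

cell_size <- 1000  # same value as in the hotspot_kde() call above

# Number of cells in a regular grid covering the bounding box
n_cells <- ceiling(width_m / cell_size) * ceiling(height_m / cell_size)
print(n_cells)  # 64 million cells

# The same box with 10 km cells needs 100 times fewer cells
n_cells_coarse <- ceiling(width_m / 1e4) * ceiling(height_m / 1e4)
print(n_cells / n_cells_coarse)
```

If the real grids are of this order of magnitude, the per-continent run time and memory use would be driven mainly by the cell count, independently of how many workers are used.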


