Hi,
I need help with parallel processing. My sequential loop currently takes a long time to run because the data is huge. In each iteration the function queries multiple tables from an SQL database and creates 3 data frames, and the rows from each iteration get appended to those 3 data frames. Is it possible to do this in parallel?
Here is the sample code:
library(doParallel)
library(foreach)
date <- read.csv("date_split_sample.csv")
head(date)
start_date end_date
01-Jan-2021 30-jan-2021
01-Feb-2021 28-Feb-2021
........
numcores <- detectCores()
cl <- makeCluster(numcores, type = "PSOCK")
registerDoParallel(cl)
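foreach(i = seq_len(nrow(date)), .combine = 'rbind') %dopar% {
  # dates for this iteration; R does not substitute variables inside a string, so build the query text
  s <- as.character(date$start_date[i])
  e <- as.character(date$end_date[i])
  a <- db.q(sprintf("select * from abcd where date between '%s' and '%s'", s, e))
  b <- db.q(sprintf("select * from efgh where date between '%s' and '%s'", s, e))
  c <- db.q(sprintf("select * from ijkl where date between '%s' and '%s'", s, e))
  # some processing using the above data frames; 3 data frames are created in each iteration
  output1
  output2
  output3
}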
I want these 3 data frames built so that the rows from each iteration are appended to them.
I do not know how to create multiple data frames and append the records from each iteration.
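
To make the goal concrete, here is a minimal sketch of the pattern I think I am after: each iteration returns a named list of 3 data frames, and a custom .combine function row-binds them position by position. The small data frames are made-up placeholders standing in for the real query results, and dates and combine_outputs are hypothetical names used only for this illustration.

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical stand-in for the date table read from the CSV
dates <- data.frame(
  start_date = c("01-Jan-2021", "01-Feb-2021"),
  end_date   = c("30-Jan-2021", "28-Feb-2021"),
  stringsAsFactors = FALSE
)

# Combine two iteration results: rbind output1 with output1, output2 with output2, ...
combine_outputs <- function(x, y) {
  Map(rbind, x, y)
}

cl <- makeCluster(2, type = "PSOCK")
registerDoParallel(cl)

results <- foreach(i = seq_len(nrow(dates)), .combine = combine_outputs) %dopar% {
  # Placeholder data frames; the real code would run its queries and processing here
  list(
    output1 = data.frame(period = i, start = dates$start_date[i], stringsAsFactors = FALSE),
    output2 = data.frame(period = i, end = dates$end_date[i], stringsAsFactors = FALSE),
    output3 = data.frame(period = i, n = i * 10)
  )
}

stopCluster(cl)

# Each element now holds the rows from all iterations
output1 <- results$output1
output2 <- results$output2
output3 <- results$output3
```

With real queries, I assume each PSOCK worker would need to open its own database connection inside the loop body, since a connection created in the main session cannot be shared with the workers.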