Hi, I have a question.
Is there any way to apply a "split-apply-combine" (divide and conquer) method to a dbplyr object (a remote table)? I found the replyr package, but its method did not work for me.
I am currently connected to a PostgreSQL database through the RPostgres library, so I am querying data that is hosted in that database. Now I just need to apply the split-apply-combine method to that data to get some results.
This is how I connect:
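library(DBI)
library(dplyr)  # dbplyr must also be installed for tbl() to work on a connection

# Helper that opens a connection to the PostgreSQL database
fun_connect <- function() {
  dbConnect(RPostgres::Postgres(),
            dbname = 'database',
            host = 'localhost', # e.g. 'ec2-54-83-201-96.compute-1.amazonaws.com'
            port = 5432, # or any other port specified by your DBA
            user = 'postgres',
            password = 'secretPassword',
            options = "-c search_path=schema")
}
# Call the function
conn <- fun_connect()
# Connect and reference the table in the "censo" database
dbtable <- tbl(conn, "table")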
The replyr approach, which does not work:
dbtable %>% replyr::replyr_split('columSplit') %>%
lapply(funToApply) %>%
replyr::replyr_bind_rows()
I am looking for something like the local split-apply-combine approach (this is just a prototype example):
library(tidyverse)
dfClassic <- data.frame(a = c(123, 234, 345, 65, 76, 678, 567, 43, 234),
                        b = c("a", "b", "a", "b", "c", "d", "c", "c", "a"))
# Split by b, sum column a within each piece, then bind the pieces together
split.data.frame(dfClassic, dfClassic$b) %>%
  map(~ data.frame(sum_a = sum(.x$a))) %>%
  bind_rows()
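For comparison, here is a minimal sketch of the same prototype written with group_by() and summarise(). It assumes the applied function is a plain sum, and the column name `total` is only illustrative. As far as I understand, this grouped form is the one dbplyr can translate to SQL when the same verbs are used on a remote table, but it only covers functions that have a SQL translation, which is why I am asking about a general split-apply-combine.

```r
library(dplyr)

# Same toy data as in the prototype above
dfClassic <- data.frame(
  a = c(123, 234, 345, 65, 76, 678, 567, 43, 234),
  b = c("a", "b", "a", "b", "c", "d", "c", "c", "a")
)

# Grouped aggregation: one row per value of b, holding the sum of a.
# `total` is a hypothetical column name chosen for this sketch.
result <- dfClassic %>%
  group_by(b) %>%
  summarise(total = sum(a))

print(result)
```

Unlike the split()/map() version, this keeps the grouping column `b` in the result.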
In the dbplyr GitHub issues, people recommended that I use the do() function. I tried the following:
dbtable%>% group_by("anycolumn") %>% do(mean(as.numeric("columnToApplyAFunction"))) %>% collect()
but this returns:
|======================================================================================= |100% ~0 s remaining
Error: No more ticks
Run `rlang::last_error()` to see where the error occurred.
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Thanks for your help.
Regards!