this is my first try to use dbplyr for connect my data server.
dbplyr doesn't just translate dplyr syntax into SQL, does it?
Until now, I have been using RODBC to get data by directly send SQL.
However, the SQL scripts were long and I was troubled by the line feed symbols on the R script (because I always use paste0()).
Also, I am better at R statements than SQL statements, so I am looking forward to dbplyr.
I would like to have the SQL statement processed on the database side without using the temporary memory of mypc.
And I was able to create up to the following code.
(my data source is DB2.)
library(dbplyr)
library(dplyr)
library(odbc)
library(DBI)
con <- dbConnect(odbc::odbc(), "my_odbc_name", UID = "name", PWD = "pass")
tbl(con,
in_schema("LIB_1", "user_data"))%>%
select(YMD,value,user) %>%
group_by(user) %>%
summarise(cal_sum = sum(value)) %>%
mutate(RN = row_number())
return
# Source: lazy query [?? x 3]
# Database: DB2...........
user cal_sum RN
<chr> <dbl> <int64>
1 "all_goodman" 114 1
2 "Walter_White" 723 2
First question
Does this code translate the code written in R into SQL and process it on the database side?
Does it do what I wanted it to do?
I hate follows.
Don't want to use PC memory.
tbl2 <- dbGetQuery(con, 'SELECT * FROM LIB_1.user_data')
tbl2 %>% .....
and this.
library(RODBC)
conn_DB <- odbcConnect("my_odbc_name", "name", "pass")
query <- paste0("select * from LIB_1.user_data where user = 'all_goodman' limit 5 ")
tbl <- sqlQuery(conn_DB, query)
tbl %>% ...
Second question
Is there any other ideal way to execute this?
Third question.
What is the meaning and origin of the name "lazy"?
Does it have anything to do with tidy or messy?
thank you !