My first attempt to use dbplyr and odbc.

this is my first try to use dbplyr for connect my data server.

dbplyr doesn't just translate dplyr syntax into SQL, does it?

Until now, I have been using RODBC to get data by directly send SQL.
However, the SQL scripts were long and I was troubled by the line feed symbols on the R script (because I always use paste0()).

Also, I am better at R statements than SQL statements, so I am looking forward to dbplyr.

I would like to have the SQL statement processed on the database side without using the temporary memory of mypc.

And I was able to create up to the following code.

(my data source is DB2.)

library(dbplyr)
library(dplyr)
library(odbc)
library(DBI)

con <- dbConnect(odbc::odbc(), "my_odbc_name", UID = "name", PWD = "pass")

tbl(con, 
    in_schema("LIB_1", "user_data"))%>% 
  select(YMD,value,user) %>% 
  group_by(user) %>% 
  summarise(cal_sum = sum(value)) %>% 
  mutate(RN = row_number())

return


# Source:   lazy query [?? x 3]
# Database: DB2...........
user                cal_sum      RN
<chr>           <dbl> <int64>
1 "all_goodman"   114       1
2 "Walter_White"   723       2

First question

Does this code translate the code written in R into SQL and process it on the database side?
Does it do what I wanted it to do?

I hate follows.
Don't want to use PC memory.

tbl2 <- dbGetQuery(con, 'SELECT * FROM LIB_1.user_data')
tbl2 %>% .....

and this.

library(RODBC)

conn_DB <- odbcConnect("my_odbc_name", "name", "pass")

query <- paste0("select * from LIB_1.user_data where user = 'all_goodman' limit 5 ")
tbl <- sqlQuery(conn_DB, query)
tbl %>% ...

Second question

Is there any other ideal way to execute this?

Third question.

What is the meaning and origin of the name "lazy"?
Does it have anything to do with tidy or messy?

thank you !

  1. The R code is just translated to SQL and executed on the database. You can check the translation via show_query().
    A dplyr backend for databases • dbplyr (tidyverse.org)

  2. I don't know what you mean. What's not ideal about your current solution?

  3. Lazy evaluation means that the expression is not evaluated until its value required.

2 Likes

To Echo on Martin's answer -

  1. Yes, dbplyr does go from R -> SQL, and you can inspect that code.

  2. You could just write SQL, either as you show with paste0 or even in a .rmd chunk that is set to SQL. I would often do this as the first chunk of a script. .rmd files support lots of languages, which is another reason why they are awesome.

  3. In the RStudio community "tidy" is usually used to talk about a tidy data frame, which is outlined in the free R For Data Science book. Basically, a data frame where columns are variables, rows are observations, and one thing is in a cell.

1 Like

@martin.R

I was relieved to know that it was being handled on the database side.
Now that I understand the behavior, I can understand the package more accurately.

Thank you very much.

@pditty

I never know .rmd chank can handling SQL.
I will serch info and try it!
(if you have any recommend website?)

thank you!

@Rsky-

One awesome thing about the RStudio community is that there are usually great resources online! Check out the full book on markdown here. See section 2.7, "Other Language Engines."

1 Like

@Rsky, this video/slides tutorial goes in-depth into the subject topic.

Courtesy of @irene.

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.