Hello RStudio Community,
Aim/ Desired Behavior
I am creating my first SQL database and am very new to SQL. The input is many CSVs that will be appended to one another within an SQL table. I would like to create/append to the SQL database/tables via R, since I will have to append additional CSVs at later timepoints. I would like to then be able to use the database with dplyr
verbs (https://db.rstudio.com/dplyr/).
Question 1: DBI
or dplyr
/dbplyr
I see I can create the database with either DBI
or dplyr
/dbplyr
and am trying to figure out which is preferable. I wasn't able to find a recommendation on https://db.rstudio.com/. I did find a comment from Hadley Wickam suggesting to use DBI, but this is from 2017 and may be outdated: https://github.com/tidyverse/dplyr/issues/3120#issuecomment-339034612
Any suggestions or links to resources are much appreciated!
I poked around with both packages and am including the reprex. (Note: The reprex uses a SQLite database; however, I will use PostGreSQL.)
Question 2: "write" vs. "create" and "insert/append"
DBI
or dplyr
/dbplyr
documentation differ in their recommendations for these methods, plus I couldn't get some of the functions to work. What is the current recommended practice?
DBI
New code should prefer dbCreateTable() and dbAppendTable(). (https://dbi.r-dbi.org/reference/dbwritetable)
dplyr
db_create_table() and db_insert_into() have been deprecated in favour of db_write_table(). (https://dplyr.tidyverse.org/reference/backend_dbplyr.html)
Thank you for your thoughts and advice!
library(DBI)
library(dbplyr)
library(dplyr)
# Create db with DBI ------------------------------------------------------
con_dbi <- DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")
# Add table to database
DBI::dbWriteTable(con_dbi, name = "mtcars1", value = mtcars)
# Doesn't work: empty table
# DBI::dbCreateTable(con_dbi, name = "mtcars2", fields = mtcars)
# Add rows to table in database
DBI::dbWriteTable(con_dbi, name = "mtcars1", value = mtcars, append = TRUE)
DBI::dbAppendTable(con_dbi, name = "mtcars1", value = mtcars)
#> [1] 32
# Create db with dplyr ----------------------------------------------------
con_dplyr <- DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")
# Add table to database
dplyr::copy_to(con_dplyr, mtcars, "mtcars1", temporary = FALSE)
dplyr::db_write_table(con_dplyr, table = "mtcars2", types = db_data_type(con_dplyr, mtcars), values = mtcars)
#> [1] "mtcars2"
# Doesn't work: empty table
# dplyr::db_create_table(con_dplyr, table = "mtcars3", types = db_data_type(con_dplyr, mtcars))
# Add rows to table in database
dplyr::db_insert_into(con_dplyr, table = "mtcars2", values = mtcars)
# Doesn't work: https://github.com/tidyverse/dplyr/issues/3120
# dplyr::db_write_table(con_dplyr, table = "mtcars2", types = db_data_type(con_dplyr, mtcars), values = mtcars, append = TRUE)
# Explore table -----------------------------------------------------------
db <- con_dbi
tab <- "mtcars1"
mtcars_db <- tbl(db, tab)
mtcars_query <- mtcars_db %>% select(cyl)
mtcars_query %>% show_query()
#> <SQL>
#> SELECT `cyl`
#> FROM `mtcars1`
mtcars_query %>% collect()
#> # A tibble: 96 x 1
#> cyl
#> <dbl>
#> 1 6
#> 2 6
#> 3 4
#> 4 6
#> 5 8
#> 6 6
#> 7 8
#> 8 4
#> 9 4
#> 10 6
#> # … with 86 more rows
Created on 2020-10-29 by the reprex package (v0.3.0)