#Error in . local(connect, statement, ...)
# could not run statement: FUNCTION aaa. factor does not exist
EDIT ONE
I have found multiple answer that requiere that I create a function to apply the require SQL querry. But I have not knowledge of SQL(that's a problem I have been avoiding). After to much looking and trying I came up with this:
I'm not sure if this is a solution. The goal of using the pool package is to do not need to load the table in my memory and I think that is exactly what as.data.frame() do.
The idea behind an R database connection is to allow bringing variables and records into memory selectively, perform some operation and then doing something with the result. That something may include writing it back to the database but doing so can be unwise.
The database libraries
Open a connection
Allow the database to be queried
Close the connection
The {pool} library deals with 1 & 3. It is primarily intended for use within shiny apps and can be replaced with more direct approaches. For example, see the example in help(dbPool).
Other libraries deal with all three aspects. A script would look like
library(DBI)
library(RMariaDB)
con <- dbConnect(RMariaDB::MariaDB(),
username="the_user",
password="the_password",
dbname ="r")
dbListTables(con)
#> [1] "jgr"
dbListFields(con, "jgr")
#> [1] "recid" "id" "v1" "v2" "v3" "v4" "v5" "v6" "v7"
#> [10] "v8" "v9" "v10" "v11" "v12" "v13" "v14" "v15" "v16"
#> [19] "v17" "v18" "v19" "v20" "v21" "v22" "v23" "v24" "v25"
#> [28] "v26" "v27" "v28" "v29" "v30"
x <- dbGetQuery(con, "SELECT v4 FROM jgr WHERE id = 'NS_000'")
y <- dbGetQuery(con, "SELECT v6 FROM jgr WHERE id = 'NS_022'")
dbDisconnect(con)
dat <- cbind(x,y)
fit <- lm(v6 ~ v4, data = dat)
summary(fit)
#>
#> Call:
#> lm(formula = v6 ~ v4, data = dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -618.21 -228.26 -6.68 13.13 1950.54
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -3.18906 13.50270 -0.236 0.813
#> v4 0.59126 0.01733 34.125 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 391.6 on 999 degrees of freedom
#> Multiple R-squared: 0.5383, Adjusted R-squared: 0.5378
#> F-statistic: 1165 on 1 and 999 DF, p-value: < 2.2e-16
This does use two SQL statements, like SELECT v4 FROM jgr WHERE id = 'NS_000'. An alternative is provided by {dbplyr}, which relieves even that slight degree of complexity by allowing queries using {dplyr} syntax.