"Pull" .SQL file into R script

Hi all,

I have been working on a R script that uses several SQL queries. Since the SQL code is also used across different product such as Tableau inside our team, we decided to move them all to a central repo that can do code review (Gitfarm) for easier maintenance.

I was wondering if there is a way that I can do something like “git pull” to get SQL code from Gitfarm to R script, or is there any other common approach to solve such problem? Thank you!

I can't find any info on GitFarm but if it is a git remote you should be able to make a git pull.
Otherwise, you should be able to download file using their url path like with github or gitlab.

Hi cderv, thanks for replying. GitFarm is like the corporate github we used for code review and version control.

So how do you use git pull in R script? I know there is R console where you can do all the Git command but I’m not familiar with it, and I also don’t know how it works inside a R script. It’s just weird because I’m trying to git pull a .sql file inside a .R file, and the code inside .sql will eventually be saved as a string in R so that i can use packages like DBI to connect and query a database.

Do you have any tutorials/examples about this? Thank you!

For using git from R, you can use git2r :package:

in your case, I think you could manage to find the url of your sql file on your gitfarm server, and then just readLines the file from the url path to get back the sql character chain you want.
Or just download the file locally without cloning the repository, i.e without any git knowledge.

example with a csv

# read from url
url <- "https://raw.githubusercontent.com/tidyverse/readr/master/inst/extdata/mtcars.csv"
read.csv(url)
#>     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#> 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> 5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
#> 6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> 7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
#> 8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#> 9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#> 10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> 11 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> 12 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
#> 13 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
#> 14 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
#> 15 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
#> 16 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
#> 17 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
#> 18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
#> 19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
#> 20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#> 21 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#> 22 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
#> 23 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
#> 24 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
#> 25 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
#> 26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
#> 27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
#> 28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
#> 29 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
#> 30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
#> 31 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
#> 32 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

# download locally
temp <- tempfile(fileext = ".csv")
download.file(url, destfile = temp)
file.exists(temp)
#> [1] TRUE
readLines(temp)
#>  [1] "\"mpg\",\"cyl\",\"disp\",\"hp\",\"drat\",\"wt\",\"qsec\",\"vs\",\"am\",\"gear\",\"carb\""
#>  [2] "21,6,160,110,3.9,2.62,16.46,0,1,4,4"                                                     
#>  [3] "21,6,160,110,3.9,2.875,17.02,0,1,4,4"                                                    
#>  [4] "22.8,4,108,93,3.85,2.32,18.61,1,1,4,1"                                                   
#>  [5] "21.4,6,258,110,3.08,3.215,19.44,1,0,3,1"                                                 
#>  [6] "18.7,8,360,175,3.15,3.44,17.02,0,0,3,2"                                                  
#>  [7] "18.1,6,225,105,2.76,3.46,20.22,1,0,3,1"                                                  
#>  [8] "14.3,8,360,245,3.21,3.57,15.84,0,0,3,4"                                                  
#>  [9] "24.4,4,146.7,62,3.69,3.19,20,1,0,4,2"                                                    
#> [10] "22.8,4,140.8,95,3.92,3.15,22.9,1,0,4,2"                                                  
#> [11] "19.2,6,167.6,123,3.92,3.44,18.3,1,0,4,4"                                                 
#> [12] "17.8,6,167.6,123,3.92,3.44,18.9,1,0,4,4"                                                 
#> [13] "16.4,8,275.8,180,3.07,4.07,17.4,0,0,3,3"                                                 
#> [14] "17.3,8,275.8,180,3.07,3.73,17.6,0,0,3,3"                                                 
#> [15] "15.2,8,275.8,180,3.07,3.78,18,0,0,3,3"                                                   
#> [16] "10.4,8,472,205,2.93,5.25,17.98,0,0,3,4"                                                  
#> [17] "10.4,8,460,215,3,5.424,17.82,0,0,3,4"                                                    
#> [18] "14.7,8,440,230,3.23,5.345,17.42,0,0,3,4"                                                 
#> [19] "32.4,4,78.7,66,4.08,2.2,19.47,1,1,4,1"                                                   
#> [20] "30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2"                                                 
#> [21] "33.9,4,71.1,65,4.22,1.835,19.9,1,1,4,1"                                                  
#> [22] "21.5,4,120.1,97,3.7,2.465,20.01,1,0,3,1"                                                 
#> [23] "15.5,8,318,150,2.76,3.52,16.87,0,0,3,2"                                                  
#> [24] "15.2,8,304,150,3.15,3.435,17.3,0,0,3,2"                                                  
#> [25] "13.3,8,350,245,3.73,3.84,15.41,0,0,3,4"                                                  
#> [26] "19.2,8,400,175,3.08,3.845,17.05,0,0,3,2"                                                 
#> [27] "27.3,4,79,66,4.08,1.935,18.9,1,1,4,1"                                                    
#> [28] "26,4,120.3,91,4.43,2.14,16.7,0,1,5,2"                                                    
#> [29] "30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2"                                                 
#> [30] "15.8,8,351,264,4.22,3.17,14.5,0,1,5,4"                                                   
#> [31] "19.7,6,145,175,3.62,2.77,15.5,0,1,5,6"                                                   
#> [32] "15,8,301,335,3.54,3.57,14.6,0,1,5,8"                                                     
#> [33] "21.4,4,121,109,4.11,2.78,18.6,1,1,4,2"

Created on 2018-07-21 by the reprex package (v0.2.0).

You can do something like that with any file on a web server. So if a sql text file, you can read from the server ( or download) and then create the SQL string to query.

2 Likes

You probably cant "pull" the sql code into an R script (well you could, but that would be really awkward). This approach should work for you (depending on your exact needs)

  • Download the sql file (download.file() or something from git2r which I am not familiar with) to a temporary directory (see tempdir(), tempfile() or wherever you want to save it. You can also use system() or system2() to execute arbitrary system commands from R (such as git pull).
  • Read it (with readLines()) and paste it to a single string with paste(..., collapse = "\n")
  • Pass it to whatever package you use to send SQL to the database
2 Likes

@hoelk

Thank you for replying, I solved this problem by the inspiration from you!

Firstly I ran something like below inside the local git directory to get the latest version of .sql from gitfarm

system('git pull')

Now that I have the latest version of .sql (let's call it tt.sql) inside this folder, I used either two of below to read the tt.sql into a str, and then I am good to go

fileName <- paste0(dir_sql_folder,'tt.sql')
a=readChar(fileName, file.info(fileName)$size)
fileName <- paste0(dir_sql_folder,'tt.sql')
a=system(paste('cat',fileName),intern = T)

Thanks for your help one more time, and if you have time, there is another consecutive problem that I encountered related to regex saved in str. Have a good day!

unrecognized escape character

Thank you for replying, I solved this problem by the inspiration from you!

Firstly I ran something like below inside the local git directory to get the latest version of .sql from gitfarm

system('git pull')

Now that I have the latest version of .sql (let's call it tt.sql) inside this folder, I used either two of below to read the tt.sql into a str, and then I am good to go

fileName <- paste0(dir_sql_folder,'tt.sql')
a=readChar(fileName, file.info(fileName)$size)
fileName <- paste0(dir_sql_folder,'tt.sql')
a=system(paste('cat',fileName),intern = T)

Thanks for your help one more time, and if you have time, there is another consecutive problem that I encountered related to regex saved in str. Have a good day!

unrecognized escape character

1 Like