odbc dbConnect fails in R Markdown

I've got a real head scratcher that I think I'm struggling with because I don't know exactly how the Knit button works in R Studio. I have a db connection in my rmarkdown document. It's a dead simple odbc connection to redshift that looks like this:


con <- DBI::dbConnect(
      odbc::odbc(),
      Driver     = "Amazon Redshift",
      Database   = "my_db",
      port       = 5439,
      host       = "redshift.me.com",
      user       = "uid",
      password   = "pdw"
    )

I use this thing ALL the time. Works great.

I had a hardware issue and had to reinstall everything on my Mac and thought all was well. However the connection above works great in an R file or in the Console... or if I use rmarkdown::render('Untitled.Rmd') however, and here is where things get weird, if I click on the "Knit" button in RStudio I get the following error:

Quitting from lines 14-25 (Untitled.Rmd) 
Error: nanodbc/nanodbc.cpp:1021: 00000: [Amazon][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function: [Amazon][DSI] An error occurred while attempting to retrieve the error message for key 'LibsLoadErr' with message parameters ['"", "libodbcinst.dylib"'] and component ID 3: Message not found in file "/opt/amazon/redshift/ErrorMessages/en-US/ODBCMessages.xml" 
Execution halted

That's a bit of a red herring. I know what causes that error. That's the error I get if I had the path for my Redshift driver file wrong in my odbcinst.ini ... but I don't. So it's like the R Markdown is being somehow run differently when I click "Render" as opposed to running rmarkdown::render('Untitled.Rmd')

I added a getwd() to the R Markdown and all ways of running it seem to be executing in the same file path. And I confirmed that all methods are reading my .Rprofile file, though that should not impact this build as I have nothing in there related to database drivers. But somehow when I click "Knit" I'm having a different experience from the other ways of running the code.

Things I have checked:

  • knit directory is set to the document directory
  • tried changing knitting directory
  • I tried clearing knit cache
  • tried knitting to different format
  • moved the Rmd file to a different directory

Any ideas?

crazy-pills-will-ferrell

Hi @jdlong !

render() in Console from RStudio is working right ?
But not working when clicking the render button, right ?

In the second case, the same function should be called rmarkdown::render() but in a clean background session. However that session should be the same context environment as in console (meaning using .Rprofile and .Renviron, and using any environment configuration from your RStudio environment).

BTW, is it RStudio Workbench on Server or RStudio Desktop ?

It seems to be a obdc configuration issue. Are you using standard configuration ? Or Are you setting some user configuration, environment configuration or something else ? Or is it all system-wise configured ?

in order to debug further within the R Markdown, you could try getting lower level and see what obdc is finding.

  • odbc::odbcListDrivers() should give you the found drivers from odbcinst.ini
  • odbc::odbcListDataSources()shoud give you the data source from odbc.ini

Adding check on the environment variables or configuration that could impact the file path used for those is another idea (more low level stuff in unixODBC without the GUI)

Trying to print those different configuration value from within the Rmd document could maybe show us what could be different in both environment (background rendering vs console rendering). You could print also all the environment variable to see if there is a notable difference regarding ODBC or user access.

Hope it helps ! Good debugging !

1 Like

Christophe, great input! Thank you. I'll clarify a few things you asked about:

Behavior: RMD builds using render and code runs "block by block" in the UI, but does not build when clicking the 'Knit' button in RStudio Desktop.

Platform: RStudio Desktop on Mac.

ODBC details: Using UnixODBC on Mac. RStudio version 2022.02.3+492 and R version 4.1.3

Great idea with odbc::odbcListDrivers() and odbc::odbcListDataSources() - I created an RMD with both of those commands in a block and ran it. I get the same results running it in an interactive block or running it with the "Knit" button. I really was suspicious this was going to show me something useful.

The way I always use UnixODBC is to set up only the drivers in my .ini files and then use connection strings in my code. I want as little config as possible outside the code. So I'm having trouble imagining what's happening differently in the code execution using the different build methods.

However trying to figure out what in the world is different between the two run modes, I have discovered an oddity that I can't understand. Running sessionInfo() interactively vs. when knitting using the button, it's reporting a different MacOS version (!!!) I have no idea how this would happen:

any clue how this would happen? Just to clarify, the results above are from the same machine. The left is interactively run and the right is from a document I knitted using the "Knit" button. On the same machine.

a look through the help for sessionInfo() shows some comment about why the version might be off, but I have no idea why the version would be different between interactive vs. "Knit" button:

#whyDaddyDrinks

Wow that is odd! Do you have several R version on your computer ? On the left I don't see BLAS info for example. I don't own mac, but this seems very weird to have such different result.

I don't know if the IDE is doing something different on this info.

But do you think this could impact the ODBC configuration ?

Yeah I thought it would too. Too bad... did you compared options and env var using same comparison as with sessionInfo ?

Also sessioninfo could give more information maybe. I don't know...

It can't explain the difference in RStudio. I don't have much more idea for now.

So I did a Sys.getenv() to print environment variables for both interactive and when clicking "Knit"... there are some differences. One that jumps out at me is the DYLD_FALLBACK_LIBRARY_PATH

when clicking "Knit" I get this:

DYLD_FALLBACK_LIBRARY_PATH                        /Library/Frameworks/R.framework/Resources/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server

But from the interactive session I get:

DYLD_FALLBACK_LIBRARY_PATH                        /Library/Frameworks/R.framework/Resources/lib:/Library/Frameworks/R.framework/Resources/lib:/Users/jal/lib:/usr/local/lib:/usr/lib:/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server:/Library/Frameworks/R.framework/Resources/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server:/var/folders/tn/mxttln7d059f_sk74nq3_p480000gn/T/rstudio-fallback-library-path-VEUy3a
E

The interactive session has a lot more things in that path... I have no idea why

Yes this could be it. This looks like LD_LIBRARY_PATH on Linux where file like "libodbcinst.lib" should be placed.

Is libodbcinst.dylib somewhere in one of the PATH missing from knitting env ?

This seems like an issue in IDE. Probably a regression ... :thinking: Out of curiosity, did you try older IDE version ? :man_shrugging:

Or last daily or next preview in case this is fixed ?

I updated to the latest R Studio nightly build and the issue persists.

Really odd. I'll keep poking around and post if I find any clues.

What I'm puzzled by is how the R Studio RMD knitter ends up with an environment that is not exactly the same as the RStudio application. There must be something different about how the "knit" button starts R vs. how the app does. But that's rather deep RStudio internals which I'm not intimately familiar with.

Well I've learned something... but not sure what it is.

I eliminated the following by manually setting the environment variables in the RMD:

  • it's not the path
  • it's not the LD_LIBRARY_PATH

However, I wanted to make sure other ODBC connections worked, so I put in a connection to a different DB using a different driver. AND THEN THEY BOTH WORKED!!!11!1 <bang head against desk>

So after some testing I figured out that if I connected to ANY other ODBC database first, then the Amazon Redshift DB connection would work. I have 3 set up on my machine: FreeTDS for SQL Server, Dremio, and Amazon Redshift. If I do either FreeTDS or Dremio first then Amazon just connects right up. But if I do Amazon first I get the error reported above.

So I'm not sure I really understand WHY it works this way. But I certainly have a dependable work around.

And this, gentle reader, is why daddy drinks.

:exploding_head:

It could be a mystery for some time, I don't have a clue. Something in the other drivers is loading up the right thing probably that amazon can't load properly

:beers:

1 Like

Not sure if it's related or not, but here is a similar-sounding problem I was similarly frustrated by last year.

Some of my 'ruleouts' ended up being inadequately ruled-out given the unknown environment variables passed from RStudio to the odbc call when running in a shell.

My problem was solved by creating a wrapper for the RStudio Server service which passed additional environment variables - these passed to the odbc driver, even though the ones in Renviron.site didn't.

I had this problem too (after updating OS) and tried similar things (e.g. printing and comparing environment variables for interactive vs. knit) and scratched my head a bunch. I figured it was probably a path issue that I couldn't diagnose so I tried adding the full path to the lib location in (for me) simba.snowflake.ini, i.e. went from

ODBCInstLib=libiodbcinst.2.dylib to
ODBCInstLib=/usr/lib/libiodbcinst.2.dylib

Not an ideal solution, but it worked (and so far hasn't broken anything else :crossed_fingers:)

2 Likes

Also seeing knitting fail to connect, but interactive work.

In my case, it is almost certainly the result of updating macOS to Monteray (the same .Rmd knitted just before update). Unlike JD, I have the same OS version between console and kniting, though, so may be a different story. I haven't checked all the solutions offered here yet.

Error: nanodbc/nanodbc.cpp:1021: 00000: [Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired  [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Failed to authenticate the user 'XXXXX' in Active Directory (Authentication option is 'ActiveDirectoryPassword').
Error code 0xA190; state 41360
Error getting realm info
  [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Could not load libcurl, make sure it is installed.  [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: Timeout error [258].   [Microsoft][ODBC Driver 17 for SQL Server]Unable to complete login process due to delay in login response 
Execution halted

Indeed, I don't have libcurl installed. But I'm not sure why that would be necessary in this case?

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.5

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] DBI_1.1.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3     rstudioapi_0.13  knitr_1.37       janitor_2.1.0    magrittr_2.0.3   hms_1.1.1        odbc_1.3.3      
 [8] tidyselect_1.1.2 bit_4.0.4        R6_2.5.1         rlang_1.0.2      fansi_1.0.3      stringr_1.4.0    blob_1.2.2      
[15] dplyr_1.0.7      tools_4.1.2      xfun_0.29        utf8_1.2.2       cli_3.3.0        ellipsis_0.3.2   readxl_1.3.1    
[22] bit64_4.0.5      assertthat_0.2.1 tibble_3.1.7     lifecycle_1.0.1  crayon_1.5.1     purrr_0.3.4      vctrs_0.4.1     
[29] snakecase_0.11.0 glue_1.6.2       stringi_1.7.6    cellranger_1.1.0 compiler_4.1.2   pillar_1.7.0     generics_0.1.1  
[36] lubridate_1.8.0  renv_0.15.2      pkgconfig_2.0.3 

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.