Unable to install package because of self-signed certificate with pak, but works with devtools

Hi,

I created an internal package, with the code hosted on an internal GitLab instance that uses a self-signed certificate.

I can install the package with devtools using devtools::install_git("https://<host>/<user>/<project>.git/").
I am trying to install the package with pak, using pak::pkg_install("git::https://<host>/<user>/<project>.git/"), so that I can list it in the Remotes dependencies of other projects, via their DESCRIPTION files and respective lockfiles. However, I get an error caused by the self-signed certificate.

#  note: I obfuscated the host and package names
r$> .Last.error
<callr_error/rlib_error_3_0/rlib_error/error>
Error: 
! error in pak subprocess
Caused by error: 
! Could not solve package dependencies:
* git::https://<host>/<user>/<project>.git/: ! pkgdepends resolution error for git::https://<host>/<user>/<project>.git/.
Caused by error: 
! Failed to download DESCRIPTION from git repo at <https://<host>/<user>/<project>.git/>.
Caused by error in `(function (e) …`:
! SSL certificate problem: self signed certificate
---
Backtrace:
1. pak::pkg_install("git::https://<host>/<user>/<project>.git/")
2. pak:::remote(function(...) get("pkg_install_make_plan", asNamespace("pak"))(...), …
3. err$throw(res$error)
---
Subprocess backtrace:
1. base::withCallingHandlers(cli_message = function(msg) { …
2. get("pkg_install_make_plan", asNamespace("pak"))(...)
3. prop$stop_for_solution_error()
4. private$plan$stop_for_solve_error()
5. pkgdepends:::pkgplan_stop_for_solve_error(self, private)
6. base::throw(new_error("Could not solve package dependencies:\n", msg, …
7. | base::signalCondition(cond)
8. global (function (e) …

I tried to dig into the code of pak and pkgdepends to see what I could do, without success. I am a bit at a loss, since I can install the package using devtools. I don't want to have to use pak and then devtools for our internal packages.

Any help or advice would be appreciated.

Thank you

You can change the default certificate file for libcurl, by adding this to your ~/.Rprofile file:

options(async_http_cainfo = "<path to certs>")

Unfortunately this is not working. I got different results with pak, pkgdepends, pkgcache, the R package curl and curl itself.

It is pretty difficult to write a reprex for this kind of issue, but I tried. The codebase is quite complex to navigate, so I can't pinpoint where the behaviour diverges between pak and pkgdepends.

library(testthat)
library(withr)
library(processx)
library(glue)
library(fs)
library(rlang)
library(pak)
library(pkgdepends)
library(pkgcache)
library(curl)
# actual value omitted
repository_url <- "https://<host>/<user>/<project>"

curl

test_that("without the correct CA bundle curl fails with a self-signed certificate error", {
  expect_error({
    tmp <- run(
      "curl",
      c(
        "--request", "HEAD",
        "--fail",
        "--silent",
        "--show-error",
        "--location",
        "--url", repository_url
      ),
      echo = TRUE,
      env = c(CURL_CA_BUNDLE = path_wd("ca-certificates_minus_self-signed.crt"))
    )
  })
})
#> curl: (60) SSL certificate problem: self-signed certificate
#> More details here: https://curl.se/docs/sslcerts.html
#> 
#> curl failed to verify the legitimacy of the server and therefore could not
#> establish a secure connection to it. To learn more about this situation and
#> how to fix it, please visit the web page mentioned above.
#> Test passed 🎊

test_that("with the correct CA bundle curl succeeds with a self-signed certificate error", {
  expect_no_error({
    tmp <- run(
      "curl",
      c(
        "--request", "HEAD",
        "--fail",
        "--silent",
        "--show-error",
        "--location",
        "--url", repository_url
      ),
      echo = TRUE,
      env = c(CURL_CA_BUNDLE = "/etc/ssl/certs/ca-certificates.crt")
    )
  })
})
#> Test passed 🎊

proving we can control which CA bundle curl should use
and that the installed one (/etc/ssl/certs/ca-certificates.crt) works.

curl (R package)

test_that("without the correct CA bundle the R package curl fails with a self-signed certificate error", {
  expect_error({
    with_envvar(
      new = c("CURL_CA_BUNDLE" = path_wd("ca-certificates_minus_self-signed.crt")),
      {
        handle <- new_handle()
        handle_setopt(handle, customrequest = "HEAD")
        tmp <- curl_fetch_memory(
          repository_url,
          handle = handle
        )
      }
    )
  })
})
#> ── Failure: without the correct CA bundle the R package curl fails with a self-signed certificate error ──
#> `{ ... }` did not throw an error.
#> Error:
#> ! Test failed

it doesn’t fail, so we didn’t change the CA bundle to use

test_that("with the correct CA bundle the R package curl succeeds with a self-signed certificate error", {
  expect_no_error({
    with_envvar(
      new = c("CURL_CA_BUNDLE" = "/etc/ssl/certs/ca-certificates.crt"),
      {
        handle <- new_handle()
        handle_setopt(handle, customrequest = "HEAD")
        tmp <- curl_fetch_memory(
          repository_url,
          handle = handle
        )
      }
    )
  })
})
#> Test passed 🎊

It succeeds, but it is a coincidence,
because the default location for the used CA bundle works,
not because we directed curl to use the correct bundle.
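
A minimal sketch of pointing the R package curl at a specific bundle explicitly, rather than through the environment. As far as I know, CURL_CA_BUNDLE is read by the curl command-line tool and not by libcurl itself on Linux, which would explain why the variable had no effect in the test above. The sketch uses the libcurl option cainfo. The helper name head_status is hypothetical, the bundle path is the one used in the tests above, and nobody = TRUE is swapped in for customrequest = "HEAD" as the usual libcurl way to send a HEAD request.

```r
library(curl)

# Path taken from the tests above; adjust to wherever the bundle lives.
ca_bundle <- "/etc/ssl/certs/ca-certificates.crt"

# Set the CA bundle on the handle itself instead of relying on CURL_CA_BUNDLE.
handle <- new_handle()
handle_setopt(handle, nobody = TRUE, cainfo = ca_bundle)

# Hypothetical helper: returns the HTTP status code of a HEAD request,
# or throws an error if the certificate cannot be verified.
head_status <- function(url, handle) {
  curl_fetch_memory(url, handle = handle)$status_code
}

# Usage (needs network access and the real URL):
# head_status(repository_url, handle)
```

Swapping ca_bundle for the bundle without the self-signed certificate should then make the same call fail, which gives the negative test that the environment variable could not provide.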

pkgdepends

test_that("with the default CA bundle pkgdepends works with self-signed certificate", {
  solution <- expect_no_error({
    prop <- new_pkg_deps(glue("git::{repository_url}.git"))$solve()
    prop$get_solution()
  })
  expect_equal(solution$status, "OK")
  expect_equal(solution$problem$total, 198)
})
#> Test passed 😸

test_that("with the correct CA bundle pkgdepends works with self-signed certificate", {
  with_options(
    new = list(async_http_cainfo = "/etc/ssl/certs/ca-certificates.crt"),
    {
      solution <- expect_no_error({
        prop <- new_pkg_deps(glue("git::{repository_url}.git"))$solve()
        prop$get_solution()
      })
      expect_equal(solution$status, "OK")
      expect_equal(solution$problem$total, 198)
    }
  )
})
#> Test passed 🎊



test_that("with the wrong CA bundle pkgdepends fails with self-signed certificate", {
  with_options(
    new = list(async_http_cainfo = path_wd("ca-certificates_minus_self-signed.crt")),
    {
      solution <- expect_no_error({
        prop <- new_pkg_deps(glue("git::{repository_url}.git"))$solve()
        prop$get_solution()
      })
      expect_equal(solution$status, "FAILED")
      expect_equal(solution$problem$total, 2)
    }
  )
})
#> ── Failure: with the wrong CA bundle pkgdepends fails with self-signed certificate ──
#> solution$status not equal to "FAILED".
#> 1/1 mismatches
#> x[1]: "OK"
#> y[1]: "FAILED"
#> Backtrace:
#>     ▆
#>  1. ├─rlang::with_options(...)
#>  2. └─testthat::expect_equal(solution$status, "FAILED")
#> 
#> ── Failure: with the wrong CA bundle pkgdepends fails with self-signed certificate ──
#> solution$problem$total not equal to 2.
#> 1/1 mismatches
#> [1] 198 - 2 == 196
#> Backtrace:
#>     ▆
#>  1. ├─rlang::with_options(...)
#>  2. └─testthat::expect_equal(solution$problem$total, 2)
#> Error:
#> ! Test failed

So pkgdepends still succeeds with the wrong bundle: this test does not show that the R option async_http_cainfo, set this way, controls the CA bundle for pkgdepends.
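
One thing worth checking here (an assumption, not something confirmed in this thread): the backtraces above show rlang::with_options(), not withr::with_options(). rlang is attached after withr in the reprex, so it masks the withr function, and the two have different signatures. The sketch below, with a placeholder bundle path, shows what each one does with new = list(...).

```r
library(withr)
library(rlang)  # attached after withr, so its with_options() masks withr's

bundle <- "/path/to/ca-bundle.crt"  # placeholder path, not a real file

# Which package does the unqualified name resolve to?
masking_pkg <- environmentName(environment(with_options))

# rlang's signature is with_options(.expr, ...): `new = list(...)` lands in
# `...`, so it sets an option literally named "new".
via_rlang <- with_options(
  new = list(async_http_cainfo = bundle),
  {
    list(cainfo = getOption("async_http_cainfo"), new = getOption("new"))
  }
)

# withr's signature is with_options(new, code): the list holds the options.
via_withr <- withr::with_options(
  new = list(async_http_cainfo = bundle),
  getOption("async_http_cainfo")
)

masking_pkg                   # "rlang"
is.null(via_rlang$cainfo)     # TRUE: async_http_cainfo was never set
identical(via_withr, bundle)  # TRUE
```

If that is what happened, calling withr::with_options() explicitly in the pkgdepends tests would be needed before drawing conclusions from them.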

pak

Default use case

test_that("with the default CA bundle pak succeeds to install package", {
  with_options(
    new = list(),
    {
      # hack to hide full backtrace with actual URLs and IP addresses
      result <- catch_cnd(
        {
          pkg_install(glue("git::{repository_url}.git"))
        },
        classes = c("error")
      )
      expect_null(result)
      expect_equal(
        result$message,
        expected = ""
      )
    }
  )
})
#> ── Failure: with the default CA bundle pak succeeds to install package ─────────
#> `result` is not NULL
#> 
#> `actual` is an S3 object of class <callr_status_error/callr_error/rlib_error_3_0/rlib_error/error/condition>, a list
#> `expected` is NULL
#> Backtrace:
#>     ▆
#>  1. ├─rlang::with_options(...)
#>  2. └─testthat::expect_null(result)
#> 
#> ── Failure: with the default CA bundle pak succeeds to install package ─────────
#> result$message not equal to "".
#> 1/1 mismatches
#> x[1]: "error in pak subprocess"
#> y[1]: ""
#> Backtrace:
#>     ▆
#>  1. ├─rlang::with_options(...)
#>  2. └─testthat::expect_equal(result$message, expected = "")
#> Error:
#> ! Test failed

unlike pkgdepends, here pak fails.

Using the R option async_http_cainfo

test_that("using the option async_http_cainfo to point pak to use the correct CA bundle works to install package", {
  with_options(
    new = list(async_http_cainfo = "/etc/ssl/certs/ca-certificates.crt"),
    {
      # hack to hide full backtrace with actual URLs and IP addresses
      result <- catch_cnd(
        {
          pkg_install(glue("git::{repository_url}.git"))
        },
        classes = c("error")
      )
      expect_null(result)
      expect_equal(
        result$message,
        expected = ""
      )
    }
  )
})
#> ── Failure: using the option async_http_cainfo to point pak to use the correct CA bundle works to install package ──
#> `result` is not NULL
#> 
#> `actual` is an S3 object of class <callr_status_error/callr_error/rlib_error_3_0/rlib_error/error/condition>, a list
#> `expected` is NULL
#> Backtrace:
#>     ▆
#>  1. ├─rlang::with_options(...)
#>  2. └─testthat::expect_null(result)
#> 
#> ── Failure: using the option async_http_cainfo to point pak to use the correct CA bundle works to install package ──
#> result$message not equal to "".
#> 1/1 mismatches
#> x[1]: "error in pak subprocess"
#> y[1]: ""
#> Backtrace:
#>     ▆
#>  1. ├─rlang::with_options(...)
#>  2. └─testthat::expect_equal(result$message, expected = "")
#> Error:
#> ! Test failed

Using the environment variable CURL_CA_BUNDLE

test_that("using CURL_CA_BUNDLE to point to the correct CA bundle pak works to install package", {
  with_envvar(
    new = c("CURL_CA_BUNDLE" = "/etc/ssl/certs/ca-certificates.crt"),
    {
      # hack to hide full backtrace with actual URLs and IP addresses
      result <- catch_cnd(
        {
          pkg_install(glue("git::{repository_url}.git"))
        },
        classes = c("error")
      )
      expect_null(result)
      expect_equal(
        result$message,
        expected = ""
      )
    }
  )
})
#> ── Failure: using CURL_CA_BUNDLE to point to the correct CA bundle pak works to install package ──
#> `result` is not NULL
#> 
#> `actual` is an S3 object of class <callr_status_error/callr_error/rlib_error_3_0/rlib_error/error/condition>, a list
#> `expected` is NULL
#> Backtrace:
#>     ▆
#>  1. ├─withr::with_envvar(...)
#>  2. │ └─base::force(code)
#>  3. └─testthat::expect_null(result)
#> 
#> ── Failure: using CURL_CA_BUNDLE to point to the correct CA bundle pak works to install package ──
#> result$message not equal to "".
#> 1/1 mismatches
#> x[1]: "error in pak subprocess"
#> y[1]: ""
#> Backtrace:
#>     ▆
#>  1. ├─withr::with_envvar(...)
#>  2. │ └─base::force(code)
#>  3. └─testthat::expect_equal(result$message, expected = "")
#> Error:
#> ! Test failed

That's a good investigation! However, you didn't try what I suggested above, i.e. setting

options(async_http_cainfo = "<path to certs>")

in your ~/.Rprofile file.

The reason for the differences is that our Linux pak builds embed a certificate bundle, so that pak can be used in minimal Docker containers that do not have any certificates installed.

If the async_http_cainfo option is set, then pak will not use its embedded cert bundle. However, this option must be set in the subprocess that pak starts up, hence you need to set it in the .Rprofile. (This is not great and we should improve it: the option should be picked up from the main R process that the user interacts with.)

So, if it is OK for you to set the option in the user's profile, that'll probably work.
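
To illustrate the general mechanism (this is only a base R sketch, not how pak actually launches its subprocess): an option set with options() lives in the current session only, so a freshly started R process does not see it unless a profile file sets it again. The bundle path is a placeholder, and R_PROFILE_USER stands in for ~/.Rprofile so that the example does not touch the real profile.

```r
rscript <- file.path(R.home("bin"), "Rscript")
bundle <- "/path/to/ca-bundle.crt"  # placeholder path

# Set the option in the current (main) session only.
options(async_http_cainfo = bundle)

# Code run in the child process: print the option, or "<unset>".
expr <- 'writeLines(getOption("async_http_cainfo", "<unset>"))'

# 1. Fresh process that reads no profile: the option is gone.
no_profile <- system2(rscript, c("--vanilla", "-e", shQuote(expr)), stdout = TRUE)

# 2. Fresh process that reads a user profile setting the option.
profile <- tempfile(fileext = ".R")
writeLines(sprintf('options(async_http_cainfo = "%s")', bundle), profile)
old <- Sys.getenv("R_PROFILE_USER", unset = NA)
Sys.setenv(R_PROFILE_USER = profile)
with_profile <- system2(
  rscript,
  c("--no-site-file", "--no-environ", "-e", shQuote(expr)),
  stdout = TRUE
)
# Restore the previous value of R_PROFILE_USER.
if (is.na(old)) Sys.unsetenv("R_PROFILE_USER") else Sys.setenv(R_PROFILE_USER = old)

no_profile    # "<unset>"
with_profile  # "/path/to/ca-bundle.crt"
```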

Oh! So by trying to be clean while testing I made it not work... Instinctively I thought setting the option manually would work better, avoiding side effects and such.

How does this work? A new R subprocess is created, so the manually set options are lost, but the new subprocess reads the .Rprofile file?

All good now. I am not a big fan of putting things in .Rprofile or .Renviron but I will make it work.

Thank you

Yes, sorry, I could have been more clear.

Yes, it works exactly as you say.

We'll improve this. pak already copies many options from the main process to its subprocess, e.g. all the configuration options listed on the pak configuration manual page ("Environment variables and options that modify the default behavior").

We'll make sure that async_http_cainfo is copied as well. Or, even better, there is work to eliminate the need for running a subprocess entirely.

Personally I am always uneasy using options with R. They are rarely well documented, and AFAIK there is no clear way to document them with roxygen, or to mark which functions would be affected by these options, etc. But passing configuration objects is quite cumbersome sometimes. I wish there was a cleaner way in R to handle this problem (not limited to pak).

I agree. Maybe roxygen could have an @option tag (and also @envvar?), and then it would generate a <package>-config manual page?

FWIW we are trying to do better in pak with the pak-config manual page, that has all options and env vars.

Well, except this particular one, because it wasn't considered to be part of the API until now.
