Scoping/Environment struggle in R markdown

Hi all,

Within a package I am working on, I am creating optional vignettes that the user can build on demand.

I want these vignettes to be built in an environment that does not have the global environment as its parent. So, my custom vignette building function includes something like:

vignette_env <- new.env(parent = baseenv())
rmarkdown::render(path.to.file, envir = vignette_env)

The problem when I do that is that the packages loaded and the objects created inside knitr chunks are not visible (in scope) to the inline calls that follow, although they are visible to other chunks.

Here is a reprex that illustrates the problem by showing that a function cannot be found after a call to library(). Sorry that it may not work out of the box on some systems, and apologies for the complexity, but I have to create a package containing vignettes and functions on the fly to mimic the situation.

To avoid issues, do restart your R session when indicated to reproduce.

## define some functions
builder <- function(path.to.file) {
  vignette_env <- new.env(parent = baseenv())
  rmarkdown::render(path.to.file, envir = vignette_env)
}

makeone <- function() 1

## create temp folder
temp_path <- path.expand("~/folder_to_delete/") ## adjust path to your test
system("rm -d -r ~/folder_to_delete/") ## delete in case existing already
dir.create(temp_path) ## creation per se

## create a package on the fly containing the function
package.skeleton("testvignetteenv", path = temp_path,
                 list = c("makeone", "builder"), force = TRUE)
rm(list = c("makeone", "builder"))

## fix the package so it can be installed without crashing
man_files <- dir(paste0(temp_path, "/testvignetteenv/man"), full.names = TRUE)
file.remove(man_files)

## add an optional vignette to the package
content <- "
---
title: test
output: 
  rmarkdown::html_document
---

```{r}
library(testvignetteenv)
```

the value `r makeone()` crashes because the function is not found.

```{r}
makeone() ## works
```
"
dir.create(paste0(temp_path, "testvignetteenv/inst/extdata/"), recursive = TRUE)
cat(content, file = paste0(temp_path, "/testvignetteenv/inst/extdata/test.Rmd"))

## install the package
install.packages(paste0(temp_path, "/testvignetteenv"), type = "source", repos = NULL)


## PLEASE RESTART YOUR R SESSION HERE!!!

## build the vignette
temp_path <- path.expand("~/folder_to_delete/") ## adjust path to your test
library(testvignetteenv)
builder(paste0(temp_path, "/testvignetteenv/inst/extdata/test.Rmd"))
utils::browseURL(paste0(temp_path, "/testvignetteenv/inst/extdata/test.html"))

What happens when I do this is the following:

FYI: since this question attracted no reaction so far, I have mentioned it there:

I did not include the above reprex

Hey,

I never observed that before. Interesting!
I knew that envir was something one should not touch often, but I never thought of an issue like that.
Here is a simpler reprex, IMO.

---
title: test
output: html_document
---

```{r}
library(xfun)
```

The extension of "test.Rmd" is `r file_ext("test.Rmd")`

```{r}
file_ext("test.Rmd")
```

Put this in test.Rmd and try rendering it with

rmarkdown::render("test.Rmd", envir = new.env(parent = baseenv()))

You'll get the error that file_ext is not found.

To complement @atusy's explanation on GH, you can consult the chapter about environments in Advanced R:
https://adv-r.hadley.nz/environments.html#search-path

A newly attached package is inserted as the parent of the global env. This means that having the global environment among the parents seems important to R's behavior if you want to use packages without explicit namespacing (with ::). I don't think you can work around that.
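
A minimal sketch of that point, using the base-distribution package tools in place of xfun (an assumption made only so that nothing extra needs to be installed). It is meant to show that an environment whose parent is baseenv() never reaches attached packages, while one that goes through the global environment does, and that explicit namespacing works either way:

```r
library(tools)  # attached on the search path, somewhere between .GlobalEnv and base

# Same kind of environment as in the render() call above: the parent is base only
isolated <- new.env(parent = baseenv())
# For comparison: an environment whose lookups go through the global environment
via_global <- new.env(parent = globalenv())

# exists() follows the chain of parent environments by default
found_isolated <- exists("file_ext", envir = isolated)
found_via_global <- exists("file_ext", envir = via_global)

# `::` lives in base, so explicit namespacing still works in the isolated environment
ext <- evalq(tools::file_ext("test.Rmd"), isolated)

print(c(isolated = found_isolated, via_global = found_via_global))
print(ext)
```

In a fresh session I would expect the first lookup to fail and the second to succeed, which matches the "file_ext not found" error above.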

I would like the building of such vignettes happens in an environment that does not have the global one as parent.

Can you explain a little more why you want that? There may be another solution if we go back to the original need.


Thanks @cderv,

That packages are parents of the global environment is not my problem, this is actually a feature I build on.

The real context is the following: I need to pass an object from the global environment to the markdown document built by my package, but I certainly do not want the user to override some functions or other objects therein.

So I thought of solving that in a clean way by defining a new environment which would only contain my package as a parent (and thus all other loaded packages at loading time of my package as parents) and my object.

To do that I used something like:

vignette_env <-  new.env(parent = as.environment("package:mypackage")) 
assign("args.vignette", value = some_object, envir = vignette_env)
rmarkdown::render(path_complete, envir = vignette_env)

By building a new environment with my package as parent, I thought it would do what I want: all functions should be there, and the assignment should add the object to it.

But it does not work because of the scoping difference between the inline and chunk code.

To me this looks like a bug in environment handling in rmarkdown, but as you said, environment handling is always tricky, so I am still hoping for an R environmentalist to step in and confirm whether this is a bug or my misunderstanding of how environments work (or both).

Of course I could embed in the markdown chunks some code to remove all but one object from the global environment, but I am not looking for a workaround; I want to tackle the issue properly. Also, even doing that does not guarantee that the objects deleted in chunks won't be available from the inline scope, as we have seen above.

The initial example does not show that. For me, it works neither inline nor in a chunk. The minimal example I tried to build here to simplify yours shows that (see post #3 in this thread, by cderv).

I can look deeper into it, but we need to build an example that shows this difference. I see above that you are using different code (no baseenv() but as.environment("package:mypackage")), so maybe it is different... The example with the package is not an easy one.

If I use your code with my example above, it works.
Rendering the code from my previous post with:

vignette_env <-  new.env(parent = as.environment("package:xfun"))
assign("args.vignette", value = "1", envir = vignette_env)
rmarkdown::render("test.Rmd", envir = vignette_env)

To pass an object from the global env to the R Markdown document, you could use the R Markdown params feature. Have you thought of that?
Your document could be parameterized, and the parameters can then be passed at run time inside the render() call using the params argument. It is a great way to pass variables to the document.

Also, if you use envir = new.env(), R Markdown will use a new environment to assign values, but it will be a child of the global env, so you can still access any value defined there.

---
title: test
output: html_document
---

This is from Global Env and equal to 

```{r}
x
```

This will assign a new value to it but not modify the one in global env

```{r}
x <- 100
```

```{r}
x
```

Try

x <- 1
rmarkdown::render("test.Rmd", envir = new.env())
x
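
The same behaviour can be checked without rendering anything. This is only a sketch: evalq() stands in for what the chunks do, and the names render_env, seen_before and seen_after are made up for the illustration.

```r
x <- 1

# environment() is the global environment when this runs at top level,
# which is also the parent that new.env() would pick by default there.
render_env <- new.env(parent = environment())

seen_before <- evalq(x, render_env)  # not defined in render_env, found via the parent
evalq(x <- 100, render_env)          # creates a new binding inside render_env only
seen_after <- evalq(x, render_env)   # now the local binding shadows the outer one

print(c(seen_before = seen_before, seen_after = seen_after, outer_x = x))
```

The outer x is left untouched because the assignment happens in the child environment, not in its parent.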

I may have misunderstood your need.

Hi again and thanks again,

It does :) In my first example, makeone() is found in the chunk but not in the inline code.
This scoping issue is the issue I would have liked us to focus on.

I had played with params, but perhaps not in conjunction with envir, and at least outside a package the following seems to work:

---
title: test
output: html_document
params:
  x: x
---

```{r}
x <- params$x
```

```{r}
x
```

`x` passed as parameter is `r x`

```{r}
x <- 100
```

`x` redefined in markdown is `r x`

which we render as:

rmarkdown::render("./Desktop/test.Rmd", params = list(x = 1), envir = new.env(parent = baseenv()))

Now the question is whether this works inside a package too... I have no time to test now, but will come back on that when I can.

Thanks again, I think we are moving forward.

As far as @cderv and I tried, makeone() is missing in the chunk as well.

BTW, I still do not see your needs.
I guess there is no easy way to modify global environment during rendering.
If you really need this feature, I recommend rlang::env_clone.

In the following example, x is 1 before cloning, and is 2 after cloning.
As render uses the cloned environment, which is a child of .GlobalEnv, render first finds x in the cloned environment.
Also, as the cloned environment is a child of .GlobalEnv, you can find exported functions from attached packages.

Render

x = 1
envir = rlang::env_clone(.GlobalEnv, .GlobalEnv)
x = 2
rmarkdown::render(
  "example.Rmd", envir = envir
)

Rmd

---
title: test
output: html_document
---


```{r}
x # is 1!!
```


```{r}
library(xfun)
```

The extension of "test.Rmd" is `r file_ext("test.Rmd")`

```{r}
file_ext("test.Rmd")
```


IMO, using variables from a (cloned) .GlobalEnv is not a good idea in terms of reproducibility.
Even if your package ensures reproducibility, the Rmd itself won't look reproducible.

I hope a parameterized Rmd works for you.

Thanks @atusy and @cderv,

Since it seems that I have difficulties communicating my problem, I have now created a better reprex by building a small, clean package, focusing on the problem and not on the context of use (since the issue is not getting something done; this I can manage using many possible workarounds).

For this I hosted the package on GH.

Please proceed as follows (do not click on the knit button) to build the vignette detailing the issue:

remotes::install_github("courtiol/testenvrmarkdown")
library(testenvrmarkdown)
build_extravignette()

I hope you will get the same output as on my computer, but just in case, I have attached a copy of the output here: copy_vignette.pdf (50.0 KB)

If you do replicate it, you will see that rmarkdown::render does not seem to handle the provided environment properly, leading to several surprising (at least to me) behaviours.

Thanks for having a look!

Question 1: why is my package attached?

Maybe you attached it yourself before calling testenvrmarkdown::build_extravignette.
Run callr::r(testenvrmarkdown::build_extravignette), and you'll see the following results.
Changing the environment won't affect the result of search().

search()
##  [1] ".GlobalEnv"        "package:stats"     "package:graphics" 
##  [4] "package:grDevices" "package:utils"     "package:datasets" 
##  [7] "package:methods"   "Autoloads"         "tools:callr"      
## [10] "package:base"

Question 2: why is hello() not found?

Because your environment is package:stats.
search() just returns the full search order.
However, in your case, with as.environment("package:stats"), the search starts from stats and skips testenvrmarkdown.

.GlobalEnv -> testenvrmarkdown -> rstudio -> stats -> graphics -> ...
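
To see the skipping in isolation, here is a small sketch that uses tools as a stand-in for testenvrmarkdown (an assumption, so that no extra package is needed). It assumes a fresh session in which tools is not attached yet, so that library(tools) puts it in front of stats on the search path:

```r
library(stats)  # normally attached already, in which case this changes nothing
library(tools)  # attached after stats, so it sits before stats on the search path

# An environment whose lookups start at package:stats and continue down from there
from_stats <- new.env(parent = as.environment("package:stats"))

found_median <- exists("median", envir = from_stats)          # defined in stats itself
found_file_ext <- exists("file_ext", envir = from_stats)      # tools is never visited
found_from_global <- exists("file_ext", envir = globalenv())  # full search path

print(search())
print(c(median = found_median,
        file_ext = found_file_ext,
        file_ext_from_global = found_from_global))
```

Only the lookup that starts from the global environment walks past tools, so only that one should find file_ext.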

Question 3: why is hello() still not found?

As we've already described, library(testenvrmarkdown) attaches testenvrmarkdown as a parent of .GlobalEnv,
not as a parent of the environment you specified.
As a result, the situation does not change from Question 2.


OK zillion thanks @atusy,

I now understand that I did not understand something crucial about how environments actually work with rmarkdown.

I had assumed that the rendering environment specified would define the outcome of search() in the vignette, but I now understand that the environment one chooses for building a vignette does not affect what environments actually exist (which is actually the normal behaviour within R).

That explains everything and that allows me to solve my issue:

Instead of defining a custom environment placed earlier than the global envir, I will simply use callr::r to make sure no extra package is attached and to get a clean, empty global envir that is not the one the user has messed with.

I will then load packages in the vignette and pass the objects I need as parameters to the vignette via render(params = list(...)).
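
A rough sketch of what that could look like, not the actual code of the package. The function name build_vignette_clean and the parameter name args_vignette are hypothetical, and the Rmd is assumed to declare args_vignette under params: in its YAML header (render() refuses parameters that are not declared there):

```r
# Hypothetical helper: renders an Rmd in a fresh R process via callr, so that
# the user's global environment and attached packages play no role.
build_vignette_clean <- function(rmd_path, object) {
  callr::r(
    # The function runs in the child process, so it must be self-contained
    # and refer to other packages with `::`.
    function(path, obj) {
      rmarkdown::render(path, params = list(args_vignette = obj))
    },
    args = list(path = rmd_path, obj = object)
  )
}
```

It would be called as build_vignette_clean("test.Rmd", some_object), and inside the document the object would then be read as params$args_vignette.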

That should nail it, I will close the issue on GH.

Thanks again and thanks to @cderv too!

Glad we could help you understand! I learnt some stuff too on this one. Thank you for asking.

And thanks a lot for the last package example! Having a good reprex is essential! I did not find time to look into it today yet... but @atusy was on the case! Thank you @atusy!
