I am trying to run a background job and it keeps failing, but the same script works when I run it in the console. Does anyone have tips for troubleshooting this issue?
My script is roughly set up this way:
# Load libraries
library(glue)
library(tidyverse)
library(ncdf4)
library(multidplyr)
# Load functions written by our team
source("~/repos/inequality/R_scripts/load_utils.R")
# set universal inputs
DB = "/some/file/path/to/input"
BASE = "folder_with_data"
OUT="/some/file/path/to/output"
# generate all scenarios I want to run
scenario_to_run = expand_grid(var_1 = c(1,2,3),
                              var_2 = c('a','b','c'),
                              var_3 = c(TRUE, FALSE))
# run function across all scenarios and produce log
run_log = pmap(scenario_to_run, safely(deciles_plot)) %>%
  transpose()
The error I see in my run_log from the background job is: object 'BASE' not found. However, I can also see that the objects DB, BASE, and OUT are all saved as values in the background job's environment. Within the deciles_plot function, DB, BASE, and OUT are the default inputs for some of the arguments. DB and BASE specifically are used to generate another directory path that is fed to a separate function. If the issue were background jobs getting confused by having an object as the default value for a function argument, then why doesn't it fail on DB, which is used before BASE?
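For reference, here is a minimal sketch of how R resolves default arguments, which may bear on the DB versus BASE question. Defaults are evaluated lazily, at first use, and are looked up starting from the function's own frame and then the environment where the function was defined, not the caller's environment. The names make_path, def_env, and caller are hypothetical stand-ins, and paste0 replaces glue to keep the example dependency-free. This is only an illustration of the lookup rule; it does not claim to reproduce what the background job actually does.

```r
# Hypothetical environment standing in for wherever the function was defined.
# It contains DB but deliberately not BASE.
def_env <- new.env(parent = baseenv())
assign("DB", "/some/file/path/to/input", envir = def_env)

# Hypothetical stand-in for the path-building part of deciles_plot.
make_path <- function(input = DB, sector_basename = BASE) {
  # `input` is forced first, then `sector_basename`.
  paste0(input, "/", sector_basename)
}
environment(make_path) <- def_env

# The caller has its own BASE, but defaults are not looked up in the caller.
caller <- function() {
  BASE <- "folder_with_data"
  make_path()
}

# DB resolves, so the failure only appears once BASE is needed.
msg <- tryCatch(caller(), error = function(e) conditionMessage(e))
print(msg)

# Supplying the argument explicitly avoids the default lookup entirely.
ok <- make_path(sector_basename = "folder_with_data")
print(ok)
```

In this sketch the call fails on BASE rather than DB simply because DB is visible from the function's defining environment and BASE is not, even though BASE exists somewhere else.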
deciles_plot = function(
    var_1,
    var_2,
    var_3,
    input = DB,
    sector_basename = BASE,
    output = OUT){
  dir <- glue('{input}/{sector_basename}')
  data = pull_raw_data_function(dir = dir, ...)
  ...}
As I mentioned before, if I clear my environment and then run this script in the console it works just fine, so it's clearly something about background jobs that is causing the issue. The issue isn't with generating and saving the BASE object, since I can see it in the background job's environment. And the job seems inconsistent about using these objects as inputs, because it doesn't fail when attempting to use DB but does fail when attempting to use BASE.
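One check that might help narrow this down is to ask which of these names are actually visible from the function's enclosing environment, rather than from the job's global environment. Below is a rough sketch using only base R. The helper visible_from and the demo objects are hypothetical; in the real script the call would presumably be something like visible_from(deciles_plot, c("DB", "BASE", "OUT")), run inside the background job.

```r
# Hypothetical helper: for each name, report whether it can be found
# starting from the environment in which `f` was defined.
visible_from <- function(f, names) {
  env <- environment(f)
  vapply(names,
         function(n) exists(n, envir = env, inherits = TRUE),
         logical(1))
}

# Demo setup (all names here are made up for illustration):
# an environment that has DB but not BASE.
demo_env <- new.env(parent = baseenv())
assign("DB", "/some/file/path/to/input", envir = demo_env)

demo_fn <- function(input = DB, sector_basename = BASE) NULL
environment(demo_fn) <- demo_env

report <- visible_from(demo_fn, c("DB", "BASE"))
print(report)
print(environment(demo_fn))  # shows where the function will look first
```

If the report differs between the console and the background job, that would suggest the function and the objects end up in different environments in the two cases.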
Also, I know I could potentially solve this issue by not having BASE be the default for sector_basename and just inputting the value assigned to BASE directly into my function. But this is just the way my team writes and uses functions, and I want to stay consistent, especially since the inconsistent behavior of background jobs implies there is a bug somewhere.
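For completeness, the workaround I mean would look roughly like the sketch below: the function signature keeps its arguments, but the constant values are passed explicitly at the call site instead of relying on default lookup. This uses base R's mapply with MoreArgs as a dependency-free stand-in for pmap, and fake_deciles_plot is a hypothetical placeholder for the real function. As far as I understand, pmap also forwards extra named arguments to the mapped function, but I have not verified that in the background job.

```r
# Hypothetical placeholder for deciles_plot: it just builds a label
# so the example can run without the team's real code.
fake_deciles_plot <- function(var_1, var_2, var_3,
                              input, sector_basename, output) {
  paste0(input, "/", sector_basename, " -> ", output,
         " [", var_1, ", ", var_2, ", ", var_3, "]")
}

DB   <- "/some/file/path/to/input"
BASE <- "folder_with_data"
OUT  <- "/some/file/path/to/output"

# Base-R equivalent of the scenario grid from the script.
scenarios <- expand.grid(var_1 = c(1, 2, 3),
                         var_2 = c("a", "b", "c"),
                         var_3 = c(TRUE, FALSE),
                         stringsAsFactors = FALSE)

# Varying arguments come from the grid; constants are passed via MoreArgs.
results <- mapply(fake_deciles_plot,
                  var_1 = scenarios$var_1,
                  var_2 = scenarios$var_2,
                  var_3 = scenarios$var_3,
                  MoreArgs = list(input = DB,
                                  sector_basename = BASE,
                                  output = OUT),
                  SIMPLIFY = FALSE)

print(length(results))
```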