I'm looking for some clarity on behaviour related to data in R packages that I don't understand.
I have written some tests for an R package I'm developing. The tests rely on data that ships with the package. The data was added to the package using usethis::use_data(), so LazyData: true is set in DESCRIPTION and the corresponding .rda files are present under data/.
From my understanding, lazy data loading means that once my package is loaded (e.g. via library() or devtools::load_all()), I should be able to access my data simply by referring to it. However, what I'm finding is that my data needs to be interacted with before I can use it. Let me demonstrate:
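First, to pin down what I mean by "lazy", here is a minimal sketch of my mental model. It uses base R's delayedAssign() as a stand-in for the package's lazy-load mechanism, which I'm assuming behaves similarly; lazy_env, loaded and the toy matrix are made-up names and data, not part of my package.

```r
# Illustration only: delayedAssign() stands in for package lazy loading.
# `lazy_env`, `loaded` and the toy matrix are hypothetical.
lazy_env <- new.env()
lazy_env$loaded <- FALSE

# Bind `counts` to a promise; the braces run only when `counts` is first read.
delayedAssign(
  "counts",
  {
    loaded <- TRUE            # side effect marks the moment of evaluation
    matrix(1:6, nrow = 2)     # toy stand-in for the real counts data
  },
  eval.env = lazy_env,
  assign.env = lazy_env
)

loaded_before <- lazy_env$loaded     # binding exists, value not built yet
counts_dim <- dim(lazy_env$counts)   # first access forces the promise
loaded_after <- lazy_env$loaded      # the promise has now been evaluated
```

Under this model the first read of counts should build the value on its own, with no separate "warm-up" call needed, which is why the behaviour below surprises me.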
Here is an example of a test for the extract_gene function:
test_that("extract_gene works", {
  metadata <- extract_gene(metadata = metadata, expr = counts, genes = 'FOXP3')
  expect_true('FOXP3' %in% colnames(metadata))
})
where metadata and counts are data objects exported by my package. In the test environment my package is loaded, so I expect to be able to refer to them as written. But the tests fail, with error messages indicating that the data is missing.
However, if I add a line above the function call that interacts with the data, the test passes:
test_that("extract_gene works", {
  dim(counts) # does not print
  metadata <- extract_gene(metadata = metadata, expr = counts, genes = 'FOXP3')
  expect_true('FOXP3' %in% colnames(metadata))
})
This runs without error. Notably, however, the dim(counts) call does not print anything. It doesn't produce an error, but it looks to me as though it doesn't actually run. My guess is that the data is "invisible" until it has been interacted with once.
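One caveat on the missing output: as far as I know, R only auto-prints values at the top level, so a silent dim() call inside a block does not by itself show that the call was skipped. A minimal sketch of that, where run_block() is a hypothetical stand-in for a wrapper like test_that() (I'm assuming test_that() evaluates its code argument in a comparable way) and the matrix is toy data:

```r
# `run_block()` is a hypothetical stand-in for a wrapper such as test_that():
# it evaluates the code it is given and discards the resulting value.
run_block <- function(code) {
  force(code)
  invisible(NULL)
}

calls <- 0
printed <- utils::capture.output(
  run_block({
    calls <- calls + 1           # side effect proves the block ran
    dim(matrix(1:6, nrow = 2))   # evaluated, but the value is not auto-printed
  })
)

ran <- calls == 1                 # TRUE if the block was evaluated once
silent <- length(printed) == 0    # TRUE if nothing was written to the console
```

If the same holds inside test_that(), wrapping the call as print(dim(counts)) would be a more reliable way to check whether the line is reached.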
I don't understand this behaviour at all. For now, my workaround is to add "filler" calls that touch my data in every test so that the rest of the test can "see" the data.