Serializing a function does not give consistent output?

I'm making a simple caching/memoization add-on for an ongoing analysis. I thought it would be pretty simple: given the same function and inputs (and no side effects or global access), return the same output. I'm using the digest function, and it works fairly well. However, after I modified the file in which I defined the function, I would always get three cache misses rather than one. I've isolated the [first] problem in the snippet below:

test_function <- function(x) {
  mean(x)
}

s1 <- serialize(test_function, connection = NULL, ascii = F)

test_function(1:20)
s2 <- serialize(test_function, connection = NULL, ascii = F)
which(s1 != s2)

test_function(1:20)
s3 <- serialize(test_function, connection = NULL, ascii = F)
which(s2 != s3)

test_function(1:20)
s4 <- serialize(test_function, connection = NULL, ascii = F)
which(s3 != s4)

When I run the function, its serialization changes between s1 and s2 as well as between s2 and s3. The first change affects only one byte, and the second seems to nearly double the length of the serialization. Can someone tell me what's going on here? I need to be able to reuse cached results quickly if the function is unchanged, but to automatically re-run the whole function if its code has changed (as I am in the middle of writing and modifying functions).

And for what it's worth, it seems like R.cache has the same problem? I'm not 100% sure.

I don't know why the function serialization changes when run (and even worse if you display test_function without running), but if you're just interested in changes to the code of the function, you can compare only the body:

test_function <- function(x) {
  mean(x)
}

s1 <- serialize(body(test_function), connection = NULL, ascii = F)

test_function(1:20)
#> [1] 10.5
s2 <- serialize(body(test_function), connection = NULL, ascii = F)
which(s1 != s2)
#> integer(0)

test_function(1:20)
#> [1] 10.5
s3 <- serialize(body(test_function), connection = NULL, ascii = F)
which(s2 != s3)
#> integer(0)

test_function(1:20)
#> [1] 10.5
s4 <- serialize(body(test_function), connection = NULL, ascii = F)
which(s3 != s4)
#> integer(0)

And it also seems to work directly for a memoized function.
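
If the argument list matters as well as the body, the same idea can be wrapped in a small helper. This is only a sketch: code_signature is a hypothetical name, not something from R or from any package, and it assumes that the formals and the body together are enough to identify the code you care about. The resulting raw vector could then be passed to digest together with the inputs to build a cache key.

```r
# Hypothetical helper (not part of base R or any package): serialize only
# the parts of a closure that describe its code, i.e. its argument list
# and its body, rather than the closure object itself.
code_signature <- function(f) {
  stopifnot(is.function(f))
  serialize(list(formals(f), body(f)), connection = NULL)
}

test_function <- function(x) {
  mean(x)
}

before <- code_signature(test_function)
for (i in 1:3) test_function(1:20)  # call the function a few times
after <- code_signature(test_function)

# Compare the signature taken before the calls with the one taken after.
identical(before, after)

# Redefine the function with a different body and compare again.
test_function <- function(x) {
  median(x)
}
identical(before, code_signature(test_function))
```

The first comparison should come out TRUE and the second FALSE, which is the behaviour a code-aware cache needs: calling the function does not invalidate anything, editing it does.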

Amazing! No need to dive into the inner workings of serialize if I don't have to. This looks great. Thanks!
