I don't think* that this has much (or anything) to do with what you are doing in your code. I believe that the STRING_PTR error is coming from R itself, due to a change in R's internals that happened between R 3.4.4 and R 3.5.0 (this is what people are discussing in the tidyr pull request you linked to in the first post).
STRING_PTR is part of R's internals (it lives in memory.c). Previously, it was not as picky as it might have been about checking for valid types of things passed to it, which (I gather!) allowed for a bit of a hack that proved useful to tidyverse package authors in some circumstances. In Sep 2017, a member of the R Core Team committed a change that made STRING_PTR check objects passed to it more carefully, which meant that code that relied on its previous un-pickiness started throwing errors. These errors are coming from deep inside R. In purrrlyr's case, I do not think there is anything you can do differently in your code to avoid these errors. It's a conflict between how the package is implemented and how R works as of version 3.5.0.
purrrlyr was created as a container for functions that had been removed from purrr and dplyr before the STRING_PTR change was made to (what became) R 3.5.0. The NEWS for purrrlyr 0.0.1 (appeared on CRAN in April 2017) says:
All data-frame based mappers have been moved to this package. These functions are not technically deprecated (so you can move to this package as easily as possible), but these functions are unlikely to be changed in the future (i.e. there will be no bug fixes) and are likely to go away in the near future, so we highly recommend updating to new approaches.
- Mapping a function to each column of a data frame should now be handled with the colwise mutating and summarising operations in dplyr instead of dmap(). These are the verbs with suffix _all(), _at() and _if(), such as mutate_all() or summarise_if(). Note that this means the output of .f should conform to the requirements of dplyr operations: same length as the input for mutating operations, and length 1 for summarising operations.
- Invoking a function row by row with the columns of a data frame as arguments should be done with pmap() followed by dplyr::as_dataframe() instead of map_rows().
- Mapping rowwise slices of a data frame with by_row() is deprecated in favour of a combination of tidyverse functions. First use tidyr::nest() to create a list-column containing groupwise data frames. Then use dplyr::mutate() to operate on this list-column. Typically you will want to apply a function on each element (nested data frame) of this list-column with purrr::map().
(emphasis mine)
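To make the last NEWS suggestion a bit more concrete, here is a minimal sketch of the nest-then-map pattern. It uses the built-in mtcars data rather than anything from this thread, and the column name n_rows is just an illustrative choice; treat it as one possible reading of the NEWS advice, not as the approach the package authors have in mind.

```r
library(dplyr)
library(tidyr)
library(purrr)

# One row per cyl value, with the remaining columns packed into a
# list-column of data frames (named `data` by tidyr::nest()).
nested <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

# Operate on each nested data frame with a purrr mapper inside mutate().
# n_rows is a hypothetical column name chosen for this example.
result <- nested %>%
  mutate(n_rows = map_int(data, nrow))

result
```

Each element of the data list-column is an ordinary data frame, so any function that takes a data frame can be mapped over it in the same way.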
What I'm taking from all this is that while the STRING_PTR bug got fixed elsewhere, it has not yet been fixed in purrrlyr, I suspect because purrrlyr isn't currently a development priority. You may as well file your example above (or an even more minimal version) in the purrrlyr issue tracker, but I wouldn't hold out hope for a bug fix anytime soon.
Some options if you (or anybody else!) want to keep using purrrlyr for tasks that are throwing this STRING_PTR error:
- Use it with R < 3.5 (see below). But, this is obviously unappealing as a long-term plan.
- Fork purrrlyr and fix the problem (maybe looking to the similar work that was done on tidyr or other tidyverse packages for inspiration). But, this requires acquiring the necessary programming skills, or teaming up with somebody else who has them.
Or you can choose to start moving to the alternative methods outlined in the NEWS file or all the other links earlier in this thread. I'm sorry that's not better news!
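For what it's worth, here is a rough sketch of how the row-wise example from this thread might look without purrrlyr, using dplyr::mutate() with purrr::pmap_dbl(). The data are rebuilt with tibble() for readability, and the argument names d, m and s are placeholders I made up. One difference to be aware of: this gives a plain double column, whereas by_row() with collate = "rows" returned a list-column here.

```r
library(dplyr)
library(purrr)
library(tibble)

# Same shape as the example data: one nested tibble per row,
# plus per-row mean (mu) and standard deviation (stdev).
dat <- tibble(
  vars  = c("var_1", "var_2"),
  data  = list(
    tibble(time = 1:10, value = 1:10),
    tibble(time = 1:10, value = 11:20)
  ),
  mu    = c(1, 2),
  stdev = c(1, 2)
)

# pmap_dbl() walks the three columns in parallel, one row at a time,
# and passes them positionally to the function (d, m, s are
# hypothetical argument names).
result <- dat %>%
  mutate(
    .out = pmap_dbl(
      list(data, mu, stdev),
      function(d, m, s) dnorm(x = d$value[[1]], mean = m, sd = s)
    )
  )

result
```

If you would rather keep a list-column, swapping pmap_dbl() for pmap() should do it.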
* I have not dug super deep into this, so all of the above should be understood as a hypothesis.
purrrlyr still works with R < 3.5
Your example in R 3.5.0 (throws error)
library(purrrlyr)
# dataframe
dat <- structure( list( vars = c("var_1", "var_2"), data = list( structure(
list(time = 1:10, value = c(1:10)), row.names = c(NA,-10L), class =
c("tbl_df", "tbl", "data.frame") ), structure( list(time = 1:10, value =
c(11:20)), row.names = c(NA,-10L), class = c("tbl_df", "tbl", "data.frame")
) ), mu = c(1, 2), stdev = c(1, 2) ), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA,-2L) )
# applying operation row-wise
dat %>%
  purrrlyr::by_row(
    .d = .,
    ..f = ~dnorm(x = .$data[[1]]$value[[1]], mean = .$mu[[1]], sd = .$stdev[[1]]),
    collate = "rows"
  )
#> Error in purrrlyr::by_row(.d = ., ..f = ~dnorm(x = .$data[[1]]$value[[1]], : STRING_PTR() can only be applied to a 'character', not a 'list'
Created on 2018-11-14 by the reprex package (v0.2.1)
Session info
sessionInfo()
#> R version 3.5.0 (2018-04-23)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 16.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
#> LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] purrrlyr_0.0.3
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.0 knitr_1.20 bindr_0.1.1 magrittr_1.5
#> [5] tidyselect_0.2.5 R6_2.3.0 rlang_0.3.0.1 stringr_1.3.1
#> [9] dplyr_0.7.8 tools_3.5.0 htmltools_0.3.6 yaml_2.2.0
#> [13] rprojroot_1.3-2 digest_0.6.18 assertthat_0.2.0 tibble_1.4.2
#> [17] crayon_1.3.4 bindrcpp_0.2.2 purrr_0.2.5 glue_1.3.0
#> [21] evaluate_0.12 rmarkdown_1.10 stringi_1.2.4 compiler_3.5.0
#> [25] pillar_1.3.0 backports_1.1.2 pkgconfig_2.0.2
Your example in R 3.4.4 (runs successfully)
library(purrrlyr)
# dataframe
dat <- structure( list( vars = c("var_1", "var_2"), data = list( structure(
list(time = 1:10, value = c(1:10)), row.names = c(NA,-10L), class =
c("tbl_df", "tbl", "data.frame") ), structure( list(time = 1:10, value =
c(11:20)), row.names = c(NA,-10L), class = c("tbl_df", "tbl", "data.frame")
) ), mu = c(1, 2), stdev = c(1, 2) ), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA,-2L) )
# applying operation row-wise
dat %>%
  purrrlyr::by_row(
    .d = .,
    ..f = ~dnorm(x = .$data[[1]]$value[[1]], mean = .$mu[[1]], sd = .$stdev[[1]]),
    collate = "rows"
  )
#> # tibble [2 × 5]
#> vars data mu stdev .out
#> <chr> <list> <dbl> <dbl> <list>
#> 1 var_1 <tibble [10 × 2]> 1 1 <dbl [1]>
#> 2 var_2 <tibble [10 × 2]> 2 2 <dbl [1]>
Created on 2018-11-15 by the reprex package (v0.2.1)
Session info
sessionInfo()
#> R version 3.4.4 (2018-03-15)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 16.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
#> LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] purrrlyr_0.0.3
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.0 knitr_1.20 bindr_0.1.1 magrittr_1.5
#> [5] tidyselect_0.2.5 R6_2.3.0 rlang_0.3.0.1 fansi_0.4.0
#> [9] stringr_1.3.1 dplyr_0.7.8 tools_3.4.4 utf8_1.1.4
#> [13] cli_1.0.1 htmltools_0.3.6 yaml_2.2.0 rprojroot_1.3-2
#> [17] digest_0.6.18 assertthat_0.2.0 tibble_1.4.2 crayon_1.3.4
#> [21] bindrcpp_0.2.2 purrr_0.2.5 glue_1.3.0 evaluate_0.12
#> [25] rmarkdown_1.10 stringi_1.2.4 compiler_3.4.4 pillar_1.3.0
#> [29] backports_1.1.2 pkgconfig_2.0.2