I must be missing something. I want to fit a particular distribution to subsets of my data. When I do it within a mutate() call, I get warnings about NaNs being produced, but if I run the same fit outside of mutate() I see no such warning. The output is the same in both cases, and I can't see where the NaN values are supposed to be.
library(tidyverse)
# fake data
dat <- tibble(grp = rep(c("A", "B"), each = 100),
              vals = c(rnbinom(100, size = 1, prob = .5),
                       rnbinom(100, size = 10, prob = .2)))
# run within `mutate()`
fit1 <- dat |>
  group_by(grp) |>
  nest() |>
  mutate(model = map(data, ~ fitdistrplus::fitdist(data = .x[["vals"]],
                                                   distr = "nbinom"))) |>
  select(-data)
#> Warning: There were 4 warnings in `mutate()`.
#> The first warning was:
#> ℹ In argument: `model = map(data, ~fitdistrplus::fitdist(data = .x[["vals"]],
#> distr = "nbinom"))`.
#> ℹ In group 1: `grp = "A"`.
#> Caused by warning in `dnbinom()`:
#> ! NaNs produced
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 3 remaining warnings.
# run outside of `mutate()`
nested <- dat |>
  group_by(grp) |>
  nest()
fit2 <- map(nested$data,
            ~ fitdistrplus::fitdist(data = .x[["vals"]], distr = "nbinom"))
all.equal(fit1$model, fit2)
#> [1] TRUE
Created on 2023-03-30 with reprex v2.0.2
Part of my question is whether I should be worried about those warnings, but I'm also curious why running the same command inside or outside of mutate() would change anything.
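One way to check whether the same warnings are actually signaled outside of mutate(), and merely not printed, might be to wrap the call in a calling handler. The sketch below uses only base R. `collect_warnings()` is a hypothetical helper written for this post, not part of any package. The `options(warn = -1)` line is an assumption standing in for whatever the fitting code may do to silence warnings; I have not checked this against the fitdistrplus source.

```r
# Hypothetical helper (not from any package): evaluate `expr` and return its
# value together with the messages of every warning signaled along the way.
collect_warnings <- function(expr) {
  msgs <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))  # record the warning message
      invokeRestart("muffleWarning")         # then stop it from being printed
    }
  )
  list(value = value, warnings = msgs)
}

# Assumption for illustration: warnings are silenced globally, as some code
# does internally. The calling handler is still notified of the warning.
old <- options(warn = -1)
res <- collect_warnings(dnbinom(1, size = -1, prob = 0.5))  # invalid `size`
options(old)

res$value     # the NaN returned by dnbinom()
res$warnings  # the warning message(s) that were signaled but not printed
```

Wrapping the `fitdistrplus::fitdist()` call from `fit2` in `collect_warnings()` should show whether the `dnbinom()` warnings occur there too. If they do, the difference between the two runs would be in how the warnings are reported rather than in the fit itself, which would be consistent with `all.equal()` returning TRUE.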