Predict on lag prediction within a grouped data frame

I'm spinning my wheels on a nitty-gritty R chunk that I'll attempt to illustrate with the diamonds data set below. In short, my problem is that I need previous predictions as input to new predictions, but within a grouped data frame. So I have two complexities to deal with here: I need something like predict(model, target ~ log(lag(previous prediction))), and the target is a cumulative sum computed within groups, so the lag is a within-group lag.

Some example code:

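library(tidyverse)

mydiamonds <- diamonds %>%
  group_by(cut, color) %>%
  mutate(rn = row_number()) %>%
  arrange(cut, color, rn) %>%
  mutate(CumPrice = cumsum(price))  # running total of price within each cut/color group

# Model each cumulative price on the log of the previous cumulative price
model <- glm(CumPrice ~ log(lag(CumPrice)) + cut + color, family = "poisson", data = mydiamonds)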

With new data, I will not know CumPrice except for the initial value at rn == 1. I want to predict it row by row, with each row's prediction feeding in as the lagged input for the next row. Again, this is within groups, so I cannot simply apply the model across the raw df.

mydiamonds.test <- mydiamonds %>% select(-CumPrice)

Pretend that mydiamonds.test is completely new holdout data that doesn't contain CumPrice, which is both the target and, via log(lag(CumPrice)), a predictor.

How could I predict onto mydiamonds.test?

Added the purrr tag since someone suggested purrr::accumulate(), which I'm looking over just now.
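For what it's worth, here is a minimal sketch of how the accumulate() idea might be applied. It is not a tested solution. It assumes the lag is stored as an explicit column computed within each group (LagCumPrice, a hypothetical name) rather than called inside the formula, and that the new data carries the known starting value in a column I've called InitCumPrice (also hypothetical). predict_group() is a helper invented for illustration, and the "test" data is just the first 20 rows of each group with CumPrice hidden, in the same spirit as the pretend holdout above.

```r
library(dplyr)
library(purrr)
library(ggplot2)  # only for the diamonds data set

# Training data: compute the lag explicitly so it stays within each group
train <- diamonds %>%
  group_by(cut, color) %>%
  mutate(rn = row_number(),
         CumPrice = cumsum(price),
         LagCumPrice = dplyr::lag(CumPrice)) %>%
  ungroup() %>%
  filter(!is.na(LagCumPrice))  # the first row of each group has no lag

model <- glm(CumPrice ~ log(LagCumPrice) + cut + color,
             family = "poisson", data = train)

# Pretend holdout: only the starting value at rn == 1 is known.
# InitCumPrice is a hypothetical column name.
test <- diamonds %>%
  group_by(cut, color) %>%
  mutate(rn = row_number(),
         # at rn == 1 the cumulative sum equals the price itself
         InitCumPrice = if_else(rn == 1, as.numeric(price), NA_real_)) %>%
  filter(rn <= 20) %>%
  ungroup() %>%
  select(cut, color, rn, InitCumPrice)

# Hypothetical helper: walk down one group's rows (ordered by rn),
# feeding each prediction in as the lag for the next row.
predict_group <- function(df) {
  step <- function(prev, i) {
    newrow <- mutate(df[i, ], LagCumPrice = prev)
    as.numeric(predict(model, newdata = newrow, type = "response"))
  }
  as.numeric(accumulate(seq_len(nrow(df))[-1], step,
                        .init = df$InitCumPrice[1]))
}

# Apply the recursion separately to every cut/color group
preds <- test %>%
  group_split(cut, color) %>%
  map(~ mutate(.x, PredCumPrice = predict_group(.x))) %>%
  bind_rows()
```

Calling predict() once per row is slow on large groups. Since cut and color are constant within a group, the same recursion could presumably be computed directly from the fitted coefficients, but the version above keeps the logic easy to follow.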
