# Use lag within mutate to refer to previous rows mutation

I'd like to include values generated for previous rows as inputs to a mutate calculation. Some data:

``````mydiamonds <- diamonds %>%
mutate(Ideal = ifelse(cut == 'Ideal', 1, 0)) %>%
group_by(Ideal) %>%
mutate(rn = row_number()) %>%
arrange(Ideal, rn) %>%
mutate(CumPrice = cumsum(price)) %>%
mutate(InitialPrice = min(price)) %>%
select(Ideal, rn, CumPrice, InitialPrice)
``````

Looks like this:

``````mydiamonds %>% head
# A tibble: 6 x 4
# Groups:   Ideal [1]
Ideal    rn CumPrice InitialPrice
<dbl> <int>    <int>        <int>
1     0     1      326          326
2     0     2      653          326
3     0     3      987          326
4     0     4     1322          326
5     0     5     1658          326
6     0     6     1994          326
``````

A model:

``````mod.diamonds = glm(CumPrice ~ log(lag(CumPrice)) +log(rn) + Ideal , family = "poisson", data = mydiamonds)
``````

Test the model:

``````# new data, pretend we don't know CumPrice but want to use predictions to predict subsequent predictions
mydiamonds.testdata <- mydiamonds %>% select(-CumPrice)
# manual prediction based on lag(prediction), for the first row in each group use InitialPrice
coeffs <- mod.diamonds\$coefficients
mydiamonds.testdata <- mydiamonds.testdata %>%
mutate(CoefIntercept = coeffs['(Intercept)'],
CoefLogLagCumPrice = coeffs['log(lag(CumPrice))'],
CoefLogRn = coeffs['log(rn)'],
CoefIdeal = coeffs['Ideal']
)
``````

Here's how my test data look:

`````` mydiamonds.testdata %>% head
# A tibble: 6 x 7
# Groups:   Ideal [1]
Ideal    rn InitialPrice CoefIntercept CoefLogLagCumPrice CoefLogRn CoefIdeal
<dbl> <int>        <int>         <dbl>              <dbl>     <dbl>     <dbl>
1     0     1          326        0.0931              0.987    0.0154 -0.000715
2     0     2          326        0.0931              0.987    0.0154 -0.000715
3     0     3          326        0.0931              0.987    0.0154 -0.000715
4     0     4          326        0.0931              0.987    0.0154 -0.000715
5     0     5          326        0.0931              0.987    0.0154 -0.000715
6     0     6          326        0.0931              0.987    0.0154 -0.000715
``````

Cannot use predict(), since I need to recursively predict where predictions for the previous day/row are input to the current day. Instead try manual prediction using the coefficents:

``````# prediction
mydiamonds.testdata <- mydiamonds.testdata %>%
mutate(
Prediction = CoefIntercept +

# here's the hard bit. If it's the first row in the group, use InitialPrice, else use the value of the previous prediction
(CoefLogLagCumPrice * ifelse(rn == 1, InitialPrice, lag(Prediction))) +

(CoefLogRn * log(rn)) +
(CoefIdeal * Ideal)
)
``````

Error: Problem with `mutate()` input `Prediction`. x object
'Prediction' not found Input `Prediction` is `+...`. The error
occurred in group 1: Ideal = 0.

How can I mutate in this way, where I'd like to refer to the previous rows mutation? (Unless it's the very first row, in which case use InitialPrice)

A suggestion

``````
#make lagCumPrice in the input
(mydiamonds2 <- mydiamonds %>% ungroup %>%
mutate(lagCumPrice=if_else(rn==1,InitialPrice,lag(CumPrice))))

mod.diamonds = glm(CumPrice ~
#use it
log(lagCumPrice) +log(rn)
+ Ideal ,
family = "poisson", data = mydiamonds2)

(mydiamonds3 <- mydiamonds2 %>% mutate(raw_pred = predict(mod.diamonds,newdata=.),
exp_pred = exp(raw_pred)) )``````

Hi thanks for the suggestion. The issue is though that CumPrice for the previous row is a prediction not an input. On training data it's there but I want to test the model. So as part of the test the CumPrice on the first row is the InitialPrice. On subsequent rows it's the prediction from the previous row

So the solution won't work if I'm reading it correctly.

I gave it a shot with accumulate, a function I'm less familiar with. Maybe someone can see how I'm using it wrong or if I can even use accumulate in this way?

``````mydiamonds.testdata <- mydiamonds.testdata %>%
mutate(
Prediction = accumulate(.f = function(.) {

.\$CoefIntercept +

# here's the hard bit. If it's the first row in the group, use InitialPrice, else use the value of the previous prediction
(.\$CoefLogLagCumPrice * ifelse(.\$rn == 1, .\$InitialPrice, lag(.\$Prediction))) +

(.\$CoefLogRn * log(.\$rn)) +
(.\$CoefIdeal * .\$Ideal)

}))
``````

Results in error:
< Error: Problem with `mutate()` input `Prediction`.
x argument ".x" is missing, with no default
Input `Prediction` is `accumulate(...)`.
The error occurred in group 1: Ideal = 0.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.