mutate() with lag and formula: number of columns depends on number of rows

budugulo · December 1, 2022, 1:12pm

Given a data frame df with just one variable v4 with four observations, I can manually mutate the three variables I want (see below). But how can I generalize the problem? For example, if the data frame has 10 rows, the number of columns will be 9.

library(dplyr)

# Toy Data
df <- tibble(v4 = c(20, 10, 8, 0))
df
#> # A tibble: 4 × 1
#>      v4
#>   <dbl>
#> 1    20
#> 2    10
#> 3     8
#> 4     0

# Constants
u <- .6
d <- .4
r <- 1.1

# df_wanted, done manually 
df_wanted <- df %>% 
  mutate(v3 = (u * lag(v4) + d * v4 )/ r,
         v2 = (u * lag(v3) + d * v3)/ r,
         v1 = (u * lag(v2) + d * v2 )/ r)

df_wanted
#> # A tibble: 4 × 4
#>      v4    v3    v2    v1
#>   <dbl> <dbl> <dbl> <dbl>
#> 1    20 NA    NA    NA   
#> 2    10 14.5  NA    NA   
#> 3     8  8.36 11.0  NA   
#> 4     0  4.36  6.15  8.22

^{Created on 2022-12-01 with reprex v2.0.2}

scottyd22 · December 1, 2022, 2:11pm

Below is one way to accomplish this by initially constructing an "empty" data frame with n rows and n columns (n-1 columns added) and then walking through each column using the formula provided.

library(tidyverse)

# Constants
u <- .6
d <- .4
r <- 1.1

# generate n values
n = 6

# first column (sample data)
col1 = data.frame(X0 = seq(100 * n, 100, -100))

# additional n-1 columns
added_columns = data.frame(matrix(nrow = n, ncol = (n - 1)))

# combine into "empty" data frame
df = bind_cols(col1, added_columns)

df
#>    X0 X1 X2 X3 X4 X5
#> 1 600 NA NA NA NA NA
#> 2 500 NA NA NA NA NA
#> 3 400 NA NA NA NA NA
#> 4 300 NA NA NA NA NA
#> 5 200 NA NA NA NA NA
#> 6 100 NA NA NA NA NA

# function to walk through
calc_column = function(i) {
  df[,i] <<- (u * lag(df[,i-1]) + d * df[,i-1])/r
}

# walk through columns of df (starting at column 2)
walk(2:length(df), calc_column)

# final output
df
#>    X0       X1       X2       X3       X4       X5
#> 1 600       NA       NA       NA       NA       NA
#> 2 500 509.0909       NA       NA       NA       NA
#> 3 400 418.1818 429.7521       NA       NA       NA
#> 4 300 327.2727 347.1074 360.6311       NA       NA
#> 5 200 236.3636 264.4628 285.4996 300.5259       NA
#> 6 100 145.4545 181.8182 210.3681 232.2246 248.3685

Created on 2022-12-01 with reprex v2.0.2.9000

nirgrahamuk · December 1, 2022, 4:16pm

Here is another option;


library(tidyverse)
library(glue)
# Toy Data


df <- tibble(v4 = c(20, 10, 8, 0))
df
# Constants
u <- .6
d <- .4
r <- 1.1

what_i_would_type <-  paste0(
  "df %>% mutate(",
      glue_collapse(map(
        3:1,
        ~ glue("v{.x} = (u * lag(v{.x+1}) + d * v{.x+1} )/ r")
      ),
      sep = "\n, "
      ),
  ")"
)

cat(what_i_would_type)

(df_got <- eval(parse(
  text = what_i_would_type
)))

budugulo · December 1, 2022, 10:49pm

@scottyd22 and @nirgrahamuk, Many thanks for your elegant solutions! I have learned many new cool tricks. Thank you again for expanding my R-horizon!

system · January 12, 2023, 10:49pm

This topic was automatically closed 42 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.