Help with map functions

Hi All,

I am trying to use map functions for iterating over specific columns and need help in getting correct output. My original data has 6 columns that would use this scenario. For sample data, I have created 3 columns. All columns starting with Volume depend upon rate and Volume from previous Month. Volume_Mnth_1 is teh only column with all populated values. For rest of the columns starting with Volume, data depends on previous month's data. So, Volume_Mth_2 depends on Volume_Mnth_1 & its rate. Volume_Mnth_3 depends on Volume_Mnth_2 & its rate.

I tried two options as follows:

Option 1. When I try to use function directly, I am getting NAs for Canada and USA in 202310 for Volume_Mnth_2 and several NAs for Volume_Mnth_3. Remaining values are correct For Canada, I would also like to see results based on 202312 for 202401.

Option 2. I tried this first and I was hoping to use pmap or map2 to iterate over columns efficiently and was getting errors on size, I was hoping to use this method on Volume_Mnth_2 & 3 simultaneously. Ideally on all columns starting with Volume except for Volume_Mnth_1 as these will all depend upon previous column from previous months.

library(tidyverse)
library(lubridate)

df <- data.frame(
  stringsAsFactors = FALSE,
              Date = c(202308L,202308L,202308L,
                       202309L,202309L,202309L,202310L,202310L,202310L,202311L,
                       202311L,202311L,202312L,202312L,202312L,202308L,
                       202308L,202308L,202309L,202309L,202309L,202310L,
                       202310L,202310L,202311L,202311L,202311L,202312L,202312L,
                       202312L,202401L,202401L,202401L,202402L,202402L,
                       202402L),
         Indicator = c(0L,0L,0L,0L,0L,0L,1L,1L,
                       1L,1L,1L,1L,1L,1L,1L,0L,0L,0L,0L,0L,0L,1L,
                       1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L),
             State = c("Canada","Canada","Canada",
                       "Canada","Canada","Canada","Canada","Canada","Canada",
                       "Canada","Canada","Canada","Canada","Canada",
                       "Canada","USA","USA","USA","USA","USA","USA","USA",
                       "USA","USA","USA","USA","USA","USA","USA","USA",
                       "USA","USA","USA","USA","USA","USA"),
          Category = c("Soft Goods","Hard Goods",
                       "Hard Goods","Soft Goods","Hard Goods","Hard Goods",
                       "Soft Goods","Hard Goods","Hard Goods","Soft Goods",
                       "Hard Goods","Hard Goods","Soft Goods","Hard Goods",
                       "Hard Goods","Soft Goods","Hard Goods","Hard Goods",
                       "Soft Goods","Hard Goods","Hard Goods","Soft Goods",
                       "Hard Goods","Hard Goods","Soft Goods","Hard Goods",
                       "Hard Goods","Soft Goods","Hard Goods","Hard Goods",
                       "Soft Goods","Hard Goods","Hard Goods","Soft Goods",
                       "Hard Goods","Hard Goods"),
              Type = c("Type A","Type B","Type C",
                       "Type A","Type B","Type C","Type A","Type B","Type C",
                       "Type A","Type B","Type C","Type A","Type B",
                       "Type C","Type A","Type B","Type C","Type A","Type B",
                       "Type C","Type A","Type B","Type C","Type A","Type B",
                       "Type C","Type A","Type B","Type C","Type A",
                       "Type B","Type C","Type A","Type B","Type C"),
           Product = c("Product A","Product B",
                       "Product C","Product A","Product B","Product C",
                       "Product A","Product B","Product C","Product A","Product B",
                       "Product C","Product A","Product B","Product C",
                       "Product A","Product B","Product C","Product A","Product B",
                       "Product C","Product A","Product B","Product C",
                       "Product A","Product B","Product C","Product A",
                       "Product B","Product C","Product A","Product B","Product C",
                       "Product A","Product B","Product C"),
              Rate = c(0.0425,0,0.0185,0.0425,0,
                       0.0185,0.0425,0,0.0185,0.0425,0,0.0185,0.0425,0,
                       0.0185,0.018083333,0,0.014166667,0.018083333,0,
                       0.014166667,0.018083333,0,0.014166667,0.018083333,0,
                       0.014166667,0.018083333,0,0.014166667,0.018083333,0,
                       0.014166667,0.018083333,0,0.014166667),
     Volume_Mnth_1 = c(60L,20L,15L,100L,10L,25L,
                       80L,0L,5L,75L,0L,0L,100L,0L,4L,45L,24L,28L,
                       65L,43L,21L,10L,0L,1L,0L,0L,1L,0L,0L,1L,10L,0L,
                       5L,15L,0L,1L),

# Looking to add data for Volume_Mnth_2 where Indicator is 1 using values from using Volume_Mnth_1  and Rate of Previous month
     Volume_Mnth_2 = c(48L,3L,10L,90L,12L,20L,NA,
                       NA,NA,NA,NA,NA,NA,NA,NA,30L,15L,20L,42L,10L,
                       32L,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
                       NA,NA),

# Looking to add data for Volume_Mnth_3 where Indicator is 1 using values from using Volume_Mnth_2  and Rate of Previous month
     Volume_Mnth_3 = c(NA,NA,NA,NA,NA,NA,NA,NA,
                       NA,NA,NA,NA,NA,NA,NA,40L,21L,12L,30L,18L,15L,
                       NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
                       NA)
)

vol_mnth_fn <- function(vol_mnth, rate){
  vol_mnth_x <- lag(round(vol_mnth * (1-rate), 0))
  
  return(vol_mnth_x)
}

# Option 1
final_vol <- df %>%
  group_by(Indicator, State, Category, Type, Product) %>%
  mutate(Volume_Mnth_2 = ifelse(Indicator == 1, vol_mnth_fn(Volume_Mnth_1, Rate), Volume_Mnth_2)) %>%
  mutate(Volume_Mnth_3 = ifelse(Indicator == 1, vol_mnth_fn(Volume_Mnth_2, Rate), Volume_Mnth_3)) 

# Option2
final_vol_qty <- list(vol_mnth = df$Volume_Mnth_1, rate = df$Rate)

final_vol1 <- df %>%
  group_by(Indicator, State, Category, Type, Product) %>%
  mutate(Volume_Mnth_2 = pmap(final_vol_qty, ~vol_mnth_fn))

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.