R: Using lag with an index variable

irasharenow100 · March 20, 2023, 10:40pm

I am trying to compute the percent change in sales in California cities from 2012 to 2022. If every city had sales in both years and all years in-between, then I could just use lag() to go back 10 rows, but I want to go back 10 years. In my example dataset, both Oakland and LA are missing 2012 data. A city that is missing one or both years should not appear in my final result.
I actually solved the problem by adding a column with 2022 – year and then filtering on that value being 10. I was also able to solve the problem with base R techniques. But I am hoping there is an option for a tidyverse function that simplifies the process. I looked at slider. I think I need a “lag_index()” instead of slide_index().
Is there a tidy function or a function in another package that solves this problem?
https://cran.r-project.org/web/packages/slider/vignettes/slider.html

Only cities that had sales in both 2012 and 2022 should appear in the output.

# sample data
tax = data.frame(city = c(rep('SF', 22), rep('Oakland', 21), rep('LA', 3), rep('SJ', 22)), 
                 year = c(2001:2022, 2001:2011, 2013:2022, 2020:2022, 2001:2022),
                 sales = c(1:22, rep(5, 11), 0, rep(6, 9), 18:20, seq(25, by = 2, length.out = 22)))
tax

# inelegant solution
tax |> 
  group_by(city) |> 
  mutate(diff12_22 = year - lag(year, 10),
         sales18 = lag(sales, 4, order_by = year),
         sales12 = lag(sales, 10, order_by = year),
         change12_22 = (sales/sales12-1)) |> 
  filter(year == 2022) |> 
  filter(diff12_22 == 10) |> 
  drop_na() |> ungroup() |> 
  mutate(rank12_22 = min_rank(desc(change12_22))) |> 
  print(n = 50)

pomchip · March 24, 2023, 8:06pm

What about a left_join -based approach:

tax %>% 
  left_join(
    tax %>% 
      filter( year == 2012 ) %>% 
      select( city, sales ) %>% 
      rename( sales_2012 = sales ),
    by = 'city'
  ) %>%
  filter( year == 2022 & !is.na(sales_2012) ) %>% 
  mutate( pct_change = 100*(sales-sales_2012)/sales_2012 )

system · April 14, 2023, 8:06pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.