Hi @Maninder,
could you try to give your sample data in form of a reprex? It will be much easier for people just to copy and paste self-contained and working code into their RStudio to get the problem and figure out a solution.
Solution idea
The key to your solution looks like a combination of arrange
for sorting by date, group_by
for grouping by state and dplyr::lag
to get a value from previous rows. And yes, thanks to "tidyverse magic" the lag
function will respect the grouping. I'm trying an example with dummy data:
library(tidyverse)
covid <- tibble(STATE = c("NSW", "NT", "QLD")) %>%
mutate(data = map(STATE, ~tibble(DATE = seq(lubridate::today(), by = "1 day", length.out = 4),
NEW_CASES = runif(4, 0, 100)))) %>%
unnest(data)
# and this could be your solution
covid %>%
# order for that lag combines the right rows
arrange(DATE) %>%
group_by(STATE) %>%
mutate(CASES_YESTERDAY = lag(NEW_CASES),
CASES_BEFORE_YESTERDAY = lag(NEW_CASES, 2)) %>%
mutate(GROWTH_RATE_1D = NEW_CASES / CASES_YESTERDAY,
GROWTH_RATE_2D = NEW_CASES / CASES_BEFORE_YESTERDAY) %>%
# re-arrange (not necessary) only to check if the grouping was respected
arrange(STATE, DATE)
#> # A tibble: 12 x 7
#> # Groups: STATE [3]
#> STATE DATE NEW_CASES CASES_YESTERDAY CASES_BEFORE_YE~ GROWTH_RATE_1D
#> <chr> <date> <dbl> <dbl> <dbl> <dbl>
#> 1 NSW 2020-09-23 69.2 NA NA NA
#> 2 NSW 2020-09-24 3.67 69.2 NA 0.0530
#> 3 NSW 2020-09-25 14.8 3.67 69.2 4.05
#> 4 NSW 2020-09-26 14.7 14.8 3.67 0.988
#> 5 NT 2020-09-23 13.5 NA NA NA
#> 6 NT 2020-09-24 91.9 13.5 NA 6.82
#> 7 NT 2020-09-25 62.4 91.9 13.5 0.679
#> 8 NT 2020-09-26 32.2 62.4 91.9 0.516
#> 9 QLD 2020-09-23 58.0 NA NA NA
#> 10 QLD 2020-09-24 50.0 58.0 NA 0.862
#> 11 QLD 2020-09-25 13.9 50.0 58.0 0.278
#> 12 QLD 2020-09-26 71.3 13.9 50.0 5.13
#> # ... with 1 more variable: GROWTH_RATE_2D <dbl>
Created on 2020-09-23 by the reprex package (v0.3.0)