# Calculating growth factor for data set.

Hi @Maninder,

Could you provide your sample data in the form of a reprex? It is much easier for people to help when they can copy and paste self-contained, working code into RStudio to reproduce the problem and figure out a solution.

### Solution idea

The key to your solution looks like a combination of `arrange` to sort by date, `group_by` to group by state, and `dplyr::lag` to get a value from previous rows. And yes, thanks to "tidyverse magic", `lag` respects the grouping: inside `mutate` on a grouped tibble it is evaluated separately for each state, so it never pulls a value from another state. Here is an example with dummy data:

``````
library(tidyverse)

# dummy data: 3 states x 4 days of random case counts
covid <- tibble(STATE = c("NSW", "NT", "QLD")) %>%
  mutate(data = map(STATE, ~ tibble(
    DATE = seq(lubridate::today(), by = "1 day", length.out = 4),
    NEW_CASES = runif(4, 0, 100)
  ))) %>%
  unnest(data)

# and this could be your solution
covid %>%
  # sort by date so that lag() picks up the previous day's value
  arrange(DATE) %>%
  group_by(STATE) %>%
  mutate(CASES_YESTERDAY = lag(NEW_CASES),
         CASES_BEFORE_YESTERDAY = lag(NEW_CASES, 2)) %>%
  mutate(GROWTH_RATE_1D = NEW_CASES / CASES_YESTERDAY,
         GROWTH_RATE_2D = NEW_CASES / CASES_BEFORE_YESTERDAY) %>%
  # re-arrange (not necessary) only to check if the grouping was respected
  arrange(STATE, DATE)
#> # A tibble: 12 x 7
#> # Groups:   STATE [3]
#>    STATE DATE       NEW_CASES CASES_YESTERDAY CASES_BEFORE_YE~ GROWTH_RATE_1D
#>    <chr> <date>         <dbl>           <dbl>            <dbl>          <dbl>
#>  1 NSW   2020-09-23     69.2            NA               NA           NA
#>  2 NSW   2020-09-24      3.67           69.2             NA            0.0530
#>  3 NSW   2020-09-25     14.8             3.67            69.2          4.05
#>  4 NSW   2020-09-26     14.7            14.8              3.67         0.988
#>  5 NT    2020-09-23     13.5            NA               NA           NA
#>  6 NT    2020-09-24     91.9            13.5             NA            6.82
#>  7 NT    2020-09-25     62.4            91.9             13.5          0.679
#>  8 NT    2020-09-26     32.2            62.4             91.9          0.516
#>  9 QLD   2020-09-23     58.0            NA               NA           NA
#> 10 QLD   2020-09-24     50.0            58.0             NA            0.862
#> 11 QLD   2020-09-25     13.9            50.0             58.0          0.278
#> 12 QLD   2020-09-26     71.3            13.9             50.0          5.13
#> # ... with 1 more variable: GROWTH_RATE_2D <dbl>
``````

Created on 2020-09-23 by the reprex package (v0.3.0)
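
If you want to convince yourself that the grouping really matters, here is a small sketch with fixed numbers instead of random ones. The `toy` tibble and its values are made up purely for illustration; only `arrange`, `group_by`, `mutate` and `lag` come from the example above. It computes the one-day ratio once with `group_by(STATE)` and once without it.

```r
library(dplyr)

# hypothetical toy data, invented for this illustration
toy <- tibble(
  STATE     = c("NSW", "NSW", "NSW", "NT", "NT", "NT"),
  DATE      = rep(as.Date("2020-09-23") + 0:2, times = 2),
  NEW_CASES = c(10, 20, 30, 5, 10, 40)
)

# with grouping: lag() restarts for every state
grouped <- toy %>%
  arrange(STATE, DATE) %>%
  group_by(STATE) %>%
  mutate(GROWTH_RATE_1D = NEW_CASES / lag(NEW_CASES)) %>%
  ungroup()

# without grouping: lag() runs over the whole column
ungrouped <- toy %>%
  arrange(STATE, DATE) %>%
  mutate(GROWTH_RATE_1D = NEW_CASES / lag(NEW_CASES))

# first NT row: NA when grouped, 5 / 30 when not grouped
grouped$GROWTH_RATE_1D[4]
ungrouped$GROWTH_RATE_1D[4]
```

In the grouped version the first row of each state is `NA`, because there is no previous day within that state. In the ungrouped version the first NT row is divided by the last NSW value, which is not what you want. Also keep in mind that a day with zero cases as the denominator gives `Inf` or `NaN`, so you may need to handle that case in your real data.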
