Aggregate and Group to find increase and diff

I have a dataset, df, where I wish to calculate the percent increase of the sum of a particular group over a time period. Here is the dataset:

     date     size       type

1/1/2020     1          a
1/1/2020     1          a
1/1/2020     1          a
1/1/2020     2          b
1/1/2020     5          b
1/1/2020     6          b
2/1/2020     20         a
2/1/2020     21         a
2/1/2020     10         a
2/1/2020     1          b
2/1/2020     4          b     
2/1/2020     5          b

Desired output

(grouping by type and date to find sum)

date        size    type

1/1/2020    3       a
2/1/2020    51      a
1/1/2020    13      b
2/1/2020    10      b 

(finding the increase and diff by type)

Final Desired Output:

 date        type      increase    diff
 
 2/1/2020    a         1600%       48    
 2/1/2020    b        -23.07%      -3



Percent Increase/Change is  final-inital/initial * 100

I was not sure if I had to break this into two separate parts so I added an intermediate desired output.

This is what I am doing:

  df %>% group_by(date, type) %>% mutate_each(funs(pct), c(date, type))

dput:

 structure(list(Date = c("1/1/2020", "1/1/2020", "1/1/2020", "1/1/2020", 
"1/1/2020", "1/1/2020", "2/1/2020", "2/1/2020", "2/1/2020", "2/1/2020", 
"2/1/2020", "2/1/2020"), size = c(1L, 1L, 1L, 2L, 5L, 6L, 20L, 
21L, 10L, 1L, 4L, 5L), type = c("a", "a", "a", "b", "b", "b", 
"a", "a", "a", "b", "b", "b")), class = "data.frame", row.names = c(NA, 

-12L))

Any suggestion is appreciated.
I am still researching the topic.

library(dplyr)

df <-  structure(list(Date = c("1/1/2020", "1/1/2020", "1/1/2020", "1/1/2020", 
                               "1/1/2020", "1/1/2020", "2/1/2020", "2/1/2020", "2/1/2020", "2/1/2020", 
                               "2/1/2020", "2/1/2020"), size = c(1L, 1L, 1L, 2L, 5L, 6L, 20L, 
                                                                 21L, 10L, 1L, 4L, 5L), type = c("a", "a", "a", "b", "b", "b", 
                                                                                                 "a", "a", "a", "b", "b", "b")), class = "data.frame", row.names = c(NA, 
                                                                                                                                                                     -12L))
df %>% 
    group_by(Date, type) %>% 
    summarise(size = sum(size)) %>%
    arrange(type, Date) %>% 
    group_by(type) %>% 
    summarise(Date = last(Date),
              diff = last(size) - first(size),
              increase = scales::percent(diff / first(size)))
#> `summarise()` regrouping output by 'Date' (override with `.groups` argument)
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 4
#>   type  Date      diff increase
#>   <chr> <chr>    <int> <chr>   
#> 1 a     2/1/2020    48 1 600%  
#> 2 b     2/1/2020    -3 -23%

Created on 2020-11-04 by the reprex package (v0.3.0.9001)

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.