I have a dataset, df, where I wish to calculate the percent increase of the sum of a particular group over a time period. Here is the dataset:
date size type
1/1/2020 1 a
1/1/2020 1 a
1/1/2020 1 a
1/1/2020 2 b
1/1/2020 5 b
1/1/2020 6 b
2/1/2020 20 a
2/1/2020 21 a
2/1/2020 10 a
2/1/2020 1 b
2/1/2020 4 b
2/1/2020 5 b
Desired output
(grouping by type and date to find sum)
date size type
1/1/2020 3 a
2/1/2020 51 a
1/1/2020 13 b
2/1/2020 10 b
(finding the increase and diff by type)
Final Desired Output:
date type increase diff
2/1/2020 a 1600% 48
2/1/2020 b -23.07% -3
Percent Increase/Change is final-inital/initial * 100
I was not sure if I had to break this into two separate parts so I added an intermediate desired output.
This is what I am doing:
df %>% group_by(date, type) %>% mutate_each(funs(pct), c(date, type))
dput:
structure(list(Date = c("1/1/2020", "1/1/2020", "1/1/2020", "1/1/2020",
"1/1/2020", "1/1/2020", "2/1/2020", "2/1/2020", "2/1/2020", "2/1/2020",
"2/1/2020", "2/1/2020"), size = c(1L, 1L, 1L, 2L, 5L, 6L, 20L,
21L, 10L, 1L, 4L, 5L), type = c("a", "a", "a", "b", "b", "b",
"a", "a", "a", "b", "b", "b")), class = "data.frame", row.names = c(NA,
-12L))
Any suggestion is appreciated.
I am still researching the topic.