Grouping Column Values and Spreading

atakzltn · August 12, 2021, 8:59pm

Hi,

I have a grouped dataframe (grouped by Date, ID and Type). This DF has also a column, Period, that is consist of some days (between 0 and 360). I want to group first 30 days in 5 days range such as 0-5, 5-10, 10-15 etc. and I want to sum Value column grouping Date, ID and Type columns. I want to group all other days after 30 days as 30-90, 90-120, 120-360. After this manipulation I will spread this column with tidyr library.

My DF:

Date	ID	Type	Period	Value
2021-08-07	123	A	1	10
2021-08-07	123	A	3	12
2021-08-07	123	A	9	20
2021-08-07	222	A	15	3
2021-08-08	123	A	1	10
2021-08-08	123	A	3	5
2021-08-08	123	A	9	17
2021-08-08	222	A	15	8

It will be like:

Date	ID	Type	Period	Value
2021-08-07	123	A	0-5	22
2021-08-07	123	A	6-10	20
2021-08-07	222	A	11-15	3
2021-08-08	123	A	0-5	15
2021-08-08	123	A	6-10	17
2021-08-08	222	A	11-15	8

Thank you!

HanOostdijk · August 13, 2021, 7:56pm

Hello @atakzltn .
It is always appreciated if you present your example data in a format that can be used immediately by someone trying to help you.
For example in the way I do it in this code :

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tibble)
library(magrittr)
library(tidyr)
#> 
#> Attaching package: 'tidyr'
#> The following object is masked from 'package:magrittr':
#> 
#>     extract

df <- tibble::tribble(
~Date, ~ID, ~Type, ~Period, ~Value,
"2021-08-07",   123,    "A",    1,  10,
"2021-08-07",   123,  "A",  3,  12,
"2021-08-07",   123,    "A",    9,  20,
"2021-08-07",   222,    "A",    15,     3,
"2021-08-08",   123,    "A",    1,  10,
"2021-08-08",   123,    "A",    3,  5,
"2021-08-08",   123,    "A",    9,  17,
"2021-08-08",   222,    "A",    15,     8)

df %>%
  mutate(perg=as.character(cut(Period,
                  breaks=c(0,6,11,Inf),
                  labels=c("0-5","6-10",">11")))
  ) %>% 
  group_by(Date,ID,Type,perg)  %>%
  summarise(Value=sum(Value)) %>%
  tidyr::pivot_wider(names_from = perg,values_from = Value,values_fill = 0) %>%
  ungroup()
#> `summarise()` has grouped output by 'Date', 'ID', 'Type'. You can override using the `.groups` argument.
#> # A tibble: 4 x 6
#>   Date          ID Type  `0-5` `6-10` `>11`
#>   <chr>      <dbl> <chr> <dbl>  <dbl> <dbl>
#> 1 2021-08-07   123 A        22     20     0
#> 2 2021-08-07   222 A         0      0     3
#> 3 2021-08-08   123 A        15     17     0
#> 4 2021-08-08   222 A         0      0     8
Created on 2021-08-13 by the reprex package (v2.0.0)

system · August 20, 2021, 7:56pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.