Mutate by grouping iso year week

Given iso year- week given, how can I mutate a new variable, say, wanted, which will repeat a value for every four weeks, and the value will be the last week of the four-week grouping?

# Example first 8 rows
> df %>% 
+   slice(1:8)
# A tibble: 8 x 2
    given  wanted
    <dbl>   <dbl>
1 2016002 2016005
2 2016003 2016005
3 2016004 2016005
4 2016005 2016005
5 2016006 2016009
6 2016007 2016009
7 2016008 2016009
8 2016009 2016009
# Example: rows 45 to 52
> df %>% 
+   slice(45:52)
# A tibble: 8 x 2
    given  wanted
    <dbl>   <dbl>
1 2016046 2016049
2 2016047 2016049
3 2016048 2016049
4 2016049 2016049
5 2016050 2017001
6 2016051 2017001
7 2016052 2017001
8 2017001 2017001

Data:

library(dplyr)
#  Data
df <- structure(list(given = c(2016002, 2016003, 2016004, 2016005, 
                         2016006, 2016007, 2016008, 2016009, 2016010, 2016011, 2016012, 
                         2016013, 2016014, 2016015, 2016016, 2016017, 2016018, 2016019, 
                         2016020, 2016021, 2016022, 2016023, 2016024, 2016025, 2016026, 
                         2016027, 2016028, 2016029, 2016030, 2016031, 2016032, 2016033, 
                         2016034, 2016035, 2016036, 2016037, 2016038, 2016039, 2016040, 
                         2016041, 2016042, 2016043, 2016044, 2016045, 2016046, 2016047, 
                         2016048, 2016049, 2016050, 2016051, 2016052, 2017001, 2017002, 
                         2017003, 2017004, 2017005, 2017006, 2017007, 2017008, 2017009, 
                         2017010, 2017011, 2017012, 2017013, 2017014, 2017015, 2017016, 
                         2017017, 2017018, 2017019, 2017020, 2017021, 2017022, 2017023, 
                         2017024, 2017025, 2017026, 2017027, 2017028, 2017029, 2017030, 
                         2017031, 2017032, 2017033, 2017034, 2017035, 2017036, 2017037, 
                         2017038, 2017039, 2017040, 2017041, 2017042, 2017043, 2017044, 
                         2017045, 2017046, 2017047, 2017048, 2017049, 2017050, 2017051, 
                         2017052, 2018001, 2018002, 2018003, 2018004, 2018005, 2018006, 
                         2018007, 2018008, 2018009, 2018010, 2018011, 2018012, 2018013, 
                         2018014, 2018015, 2018016, 2018017, 2018018, 2018019, 2018020, 
                         2018021, 2018022, 2018023, 2018024, 2018025, 2018026, 2018027, 
                         2018028, 2018029, 2018030, 2018031, 2018032, 2018033, 2018034, 
                         2018035, 2018036, 2018037, 2018038, 2018039, 2018040, 2018041, 
                         2018042, 2018043, 2018044, 2018045, 2018046, 2018047, 2018048, 
                         2018049, 2018050, 2018051, 2018052, 2019001, 2019002, 2019003, 
                         2019004, 2019005, 2019006, 2019007, 2019008, 2019009, 2019010, 
                         2019011, 2019012, 2019013, 2019014, 2019015, 2019016, 2019017, 
                         2019018, 2019019, 2019020, 2019021, 2019022, 2019023, 2019024, 
                         2019025, 2019026, 2019027, 2019028, 2019029, 2019030, 2019031, 
                         2019032, 2019033, 2019034, 2019035, 2019036, 2019037, 2019038, 
                         2019039, 2019040, 2019041, 2019042, 2019043, 2019044, 2019045, 
                         2019046, 2019047, 2019048, 2019049, 2019050, 2019051, 2019052, 
                         2020001, 2020002, 2020003, 2020004, 2020005, 2020006, 2020007, 
                         2020008, 2020009, 2020010, 2020011, 2020012, 2020013, 2020014, 
                         2020015, 2020016, 2020017, 2020018, 2020019, 2020020, 2020021, 
                         2020022, 2020023, 2020024, 2020025, 2020026, 2020027, 2020028, 
                         2020029, 2020030, 2020031, 2020032, 2020033, 2020034, 2020035, 
                         2020036, 2020037, 2020038, 2020039, 2020040, 2020041, 2020042, 
                         2020043, 2020044, 2020045, 2020046, 2020047, 2020048, 2020049, 
                         2020050, 2020051, 2020052, 2020053, 2021001, 2021002, 2021003, 
                         2021004, 2021005, 2021006, 2021007, 2021008, 2021009, 2021010, 
                         2021011, 2021012, 2021013, 2021014, 2021015, 2021016, 2021017, 
                         2021018, 2021019, 2021020, 2021021, 2021022, 2021023, 2021024, 
                         2021025, 2021026, 2021027, 2021028, 2021029, 2021030, 2021031, 
                         2021032, 2021033, 2021034, 2021035, 2021036, 2021037, 2021038, 
                         2021039, 2021040, 2021041, 2021042, 2021043, 2021044, 2021045, 
                         2021046, 2021047, 2021048, 2021049, 2021050, 2021051, 2021052
), wanted = c(2016005, 2016005, 2016005, 2016005, 2016009, 2016009, 
              2016009, 2016009, 2016013, 2016013, 2016013, 2016013, 2016017, 
              2016017, 2016017, 2016017, 2016021, 2016021, 2016021, 2016021, 
              2016025, 2016025, 2016025, 2016025, 2016029, 2016029, 2016029, 
              2016029, 2016033, 2016033, 2016033, 2016033, 2016037, 2016037, 
              2016037, 2016037, 2016041, 2016041, 2016041, 2016041, 2016045, 
              2016045, 2016045, 2016045, 2016049, 2016049, 2016049, 2016049, 
              2017001, 2017001, 2017001, 2017001, 2017005, 2017005, 2017005, 
              2017005, 2017009, 2017009, 2017009, 2017009, 2017013, 2017013, 
              2017013, 2017013, 2017017, 2017017, 2017017, 2017017, 2017021, 
              2017021, 2017021, 2017021, 2017025, 2017025, 2017025, 2017025, 
              2017029, 2017029, 2017029, 2017029, 2017033, 2017033, 2017033, 
              2017033, 2017037, 2017037, 2017037, 2017037, 2017041, 2017041, 
              2017041, 2017041, 2017045, 2017045, 2017045, 2017045, 2017049, 
              2017049, 2017049, 2017049, 2018001, 2018001, 2018001, 2018001, 
              2018005, 2018005, 2018005, 2018005, 2018009, 2018009, 2018009, 
              2018009, 2018013, 2018013, 2018013, 2018013, 2018017, 2018017, 
              2018017, 2018017, 2018021, 2018021, 2018021, 2018021, 2018025, 
              2018025, 2018025, 2018025, 2018029, 2018029, 2018029, 2018029, 
              2018033, 2018033, 2018033, 2018033, 2018037, 2018037, 2018037, 
              2018037, 2018041, 2018041, 2018041, 2018041, 2018045, 2018045, 
              2018045, 2018045, 2018049, 2018049, 2018049, 2018049, 2019001, 
              2019001, 2019001, 2019001, 2019005, 2019005, 2019005, 2019005, 
              2019009, 2019009, 2019009, 2019009, 2019013, 2019013, 2019013, 
              2019013, 2019017, 2019017, 2019017, 2019017, 2019021, 2019021, 
              2019021, 2019021, 2019025, 2019025, 2019025, 2019025, 2019029, 
              2019029, 2019029, 2019029, 2019033, 2019033, 2019033, 2019033, 
              2019037, 2019037, 2019037, 2019037, 2019041, 2019041, 2019041, 
              2019041, 2019045, 2019045, 2019045, 2019045, 2019049, 2019049, 
              2019049, 2019049, 2020001, 2020001, 2020001, 2020001, 2020005, 
              2020005, 2020005, 2020005, 2020009, 2020009, 2020009, 2020009, 
              2020013, 2020013, 2020013, 2020013, 2020017, 2020017, 2020017, 
              2020017, 2020021, 2020021, 2020021, 2020021, 2020025, 2020025, 
              2020025, 2020025, 2020029, 2020029, 2020029, 2020029, 2020033, 
              2020033, 2020033, 2020033, 2020037, 2020037, 2020037, 2020037, 
              2020041, 2020041, 2020041, 2020041, 2020045, 2020045, 2020045, 
              2020045, 2020049, 2020049, 2020049, 2020049, 2020053, 2020053, 
              2020053, 2020053, 2021004, 2021004, 2021004, 2021004, 2021008, 
              2021008, 2021008, 2021008, 2021012, 2021012, 2021012, 2021012, 
              2021016, 2021016, 2021016, 2021016, 2021020, 2021020, 2021020, 
              2021020, 2021024, 2021024, 2021024, 2021024, 2021028, 2021028, 
              2021028, 2021028, 2021032, 2021032, 2021032, 2021032, 2021036, 
              2021036, 2021036, 2021036, 2021040, 2021040, 2021040, 2021040, 
              2021044, 2021044, 2021044, 2021044, 2021048, 2021048, 2021048, 
              2021048, 2021052, 2021052, 2021052, 2021052)), row.names = c(NA, 
                                                                           -312L), class = c("tbl_df", "tbl", "data.frame"))

Will this work for you?

DF <- data.frame(given = paste0("201600", 2:9))
DF
#>     given
#> 1 2016002
#> 2 2016003
#> 3 2016004
#> 4 2016005
#> 5 2016006
#> 6 2016007
#> 7 2016008
#> 8 2016009
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
DF <- DF |> mutate(Seq = (row_number() - 1) %/% 4)
DF
#>     given Seq
#> 1 2016002   0
#> 2 2016003   0
#> 3 2016004   0
#> 4 2016005   0
#> 5 2016006   1
#> 6 2016007   1
#> 7 2016008   1
#> 8 2016009   1
DF_4 <- DF |> mutate(Mod4 = row_number() %% 4) |> 
  filter(Mod4 == 0) |> 
  select(wanted = given, Seq)
DF_4
#>    wanted Seq
#> 1 2016005   0
#> 2 2016009   1
DF <- inner_join(DF, DF_4, by = "Seq")
DF
#>     given Seq  wanted
#> 1 2016002   0 2016005
#> 2 2016003   0 2016005
#> 3 2016004   0 2016005
#> 4 2016005   0 2016005
#> 5 2016006   1 2016009
#> 6 2016007   1 2016009
#> 7 2016008   1 2016009
#> 8 2016009   1 2016009

Created on 2022-02-19 by the reprex package (v2.0.1)

1 Like

@FJCC Thanks a lot! It does work for me. Could you please explain your code and approach? I got the big picture...but a little explanation would help me grasp the solution.

You want to label each row with the value of the last row in each four-week grouping. That requires labeling rows in groups of four and labeling every fourth row.
You can create groups of four, or groups of any size N, using integer division, the %/% operator in R. A %/% N calculates how many whole times N goes into A. For the sequence 0, 1, 2, 3, 4, 5, etc. the answers are 0, 0, 0, 0, 1, 1... I made the numbering sequence using the row_number() function, subtracting one from it so the sequence would start at zero. That explains

DF <- DF |> mutate(Seq = (row_number() - 1) %/% 4)

and the rows are now labeled in groups of four.
To label every fourth row, we can make a repeating series of four labels. I chose to do this with the modulus function, %% in R. A %% N returns the remainder of the division of A by N. With N = 4 and a sequence of row numbers, that returns a repeating sequence of 1,2,3, 0 and the fourth row in each group is labeled with zero. That explains

DF |> mutate(Mod4 = row_number() %% 4)

I follow that by filtering the data to keep only the rows where Mod4 is zero with filter(Mod4 == 0). I then select to keep only the given column and the Seq column which labels which group of four each row came from. The given column gets renamed to wanted, the name we want in the final data. I do not keep the Mod4 column because we do not want that in the final data and it has served it purpose. So, now I have the data frame DF_4 that has the given value of every fourth row and the group-of-four from which it came.
The final step is to match each row in DF_4 to its group within the original data. This is what the inner_join does.

DF <- inner_join(DF, DF_4, by = "Seq")

The inner_join looks at each row of DF and matches it with the row (or rows if there were more than one) in DF_4 that has the same value of Seq. The argument by = "Seq" means "use the value of Seq to match the rows". The result is that every row in DF is matched with the fourth row of its own group-of-four.

1 Like

Many thanks for explaining all the steps! I understand it fully now :smiley:

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.