Consider the following data:
df <- tribble(
~id, ~reg_date, ~quarter,
"1", "2012-08-22", "2012 Summer",
"2", "2012-07-31", "2012 Summer",
"2", "2014-07-21", "2014 Summer",
"2", "2014-09-14", "2014 Fall",
"3", "2016-01-01", "2016 Winter",
"3", "2016-01-10", "2016 Winter",
"4", "2012-09-10", "2012 Fall",
"4", "2013-01-24", "2013 Winter",
"4", "2013-04-03", "2013 Spring"
)
For each id
, I want to "rank" the quarters in chronological order (e.g. "2012 Summer" before "2012 Fall", "2013 Winter" before "2013 Spring"). I thought I could do that with a combination of group_by
, arrange
, mutate
and dense_rank
, but the quarter_order
variable is off for ids 2 and 4:
df %>%
mutate(reg_date = lubridate::ymd(reg_date)) %>%
group_by(id) %>%
arrange(id, reg_date) %>%
mutate(quarter_order = dense_rank(quarter))
# A tibble: 9 x 4
# Groups: id [4]
id reg_date quarter quarter_order
<chr> <date> <chr> <int>
1 1 2012-08-22 2012 Summer 1
2 2 2012-07-31 2012 Summer 1
3 2 2014-07-21 2014 Summer 3
4 2 2014-09-14 2014 Fall 2
5 3 2016-01-01 2016 Winter 1
6 3 2016-01-10 2016 Winter 1
7 4 2012-09-10 2012 Fall 1
8 4 2013-01-24 2013 Winter 3
9 4 2013-04-03 2013 Spring 2
"2013 Spring", for example, comes after "2013 Winter" (at least when you arrange by reg_date
). Is there another ranking function or method I can use?