Hi all, I need to create a new variable that ranks another variable, by value. Essentially, we had participants in a recent study rate how hard they found 6 different things to be, and we now want to have a rank-order of these things for each person. I've been messing with this code forever though and can't get it right.
So far I've been using this method:
dat=tibble::tribble(~name, ~score,
"bob", 0,
"bob", 5,
"bob", 50,
"bob", 50,
"bob", 50,
"bob", NA)
library(dplyr) # provides %>% and mutate()
dat=dat %>% mutate(rank=rank(score,
ties.method = "max", na.last = FALSE))
# Flip the ranks around so they are highest to lowest
dat$rank=car::recode(dat$rank,"1 = 6 ; 2 = 5 ; 3 = 4 ; 4 = 3 ; 5 = 2 ; 6 = 1")
dat
The problem with this is that it assigns ranks like you see in sporting events; that is, if three people tied for second place, the rankings look like 1,2,2,2,5. This is not what I want...I need it to be 1,2,2,2,3. The current way I'm getting the ranks makes it look like there are several things missing in the data that aren't ranked when in reality there is only one NA.
The group id number is the rank value you need. I added another participant to make sure this works with your data set. The highest score gets rank 1; if you want the opposite order, drop the desc() in the rank() call. I was also unsure what you want to do with NAs. Here they are ranked last (they get the largest rank number).
library(tidyverse)
dat=tibble::tribble(~name, ~score,
"bob", 0,
"bob", 5,
"bob", 50,
"bob", 50,
"bob", 60,
"bob", NA,
"sue", 0,
"sue", 25,
"sue", 50,
"sue", 50,
"sue", 60,
"sue", 25)
datr <- dat %>% group_by(name) %>%
mutate(ranked = rank(desc(score), ties.method = "max", na.last = TRUE)) %>%
arrange(name, ranked)
datr %>% group_modify(~ .x %>% group_by(ranked) %>% mutate(id = cur_group_id()))
#> # A tibble: 12 × 4
#> # Groups: name [2]
#> name score ranked id
#> <chr> <dbl> <int> <int>
#> 1 bob 60 1 1
#> 2 bob 50 3 2
#> 3 bob 50 3 2
#> 4 bob 5 4 3
#> 5 bob 0 5 4
#> 6 bob NA 6 5
#> 7 sue 60 1 1
#> 8 sue 50 3 2
#> 9 sue 50 3 2
#> 10 sue 25 5 3
#> 11 sue 25 5 3
#> 12 sue 0 6 4
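For anyone wondering why the group id works: it amounts to numbering the distinct scores from highest to lowest. Here is a minimal base R sketch of that idea, using bob's scores from above. The variable names (distinct_desc, dense, dense_na_last) are just for illustration, and putting the NA last is an assumption about what you want.

```r
# Dense ranking by hand: number the distinct scores from highest to lowest.
score <- c(0, 5, 50, 50, 60, NA)

# sort() drops NA by default, so only the observed scores are numbered.
distinct_desc <- sort(unique(score), decreasing = TRUE)

# Position of each score among the distinct values; NA stays NA.
dense <- match(score, distinct_desc)

# Optional (assumption): give NA the rank after the last real one,
# which mirrors the id column in the output above.
dense_na_last <- dense
dense_na_last[is.na(dense_na_last)] <- max(dense, na.rm = TRUE) + 1L

dense
dense_na_last
```

Tied scores share a number and the next distinct score gets the next number, so no ranks are skipped.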
Turns out a slight variation on this was all I needed! dense_rank() was the key!!! Combined with the second and third lines you gave me this solved everything, thanks!
Oh, good grief! There was a simple solution after all. Still, I had fun and learned about group id. You actually need just one line, reversing the order with desc(score):
library(tidyverse)
dat=tibble::tribble(~name, ~score,
"bob", 0,
"bob", 5,
"bob", 50,
"bob", 50,
"bob", 50,
"bob", NA)
dat %>% mutate(ranked = dense_rank(desc(score)))
#> # A tibble: 6 × 3
#> name score ranked
#> <chr> <dbl> <int>
#> 1 bob 0 3
#> 2 bob 5 2
#> 3 bob 50 1
#> 4 bob 50 1
#> 5 bob 50 1
#> 6 bob NA NA
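Since the original question was about ranking within each participant, here is a sketch of how dense_rank() might be combined with group_by(), assuming the two-participant data from the earlier reply. The column name ranked and the final ungroup() are my choices, not something confirmed in the thread. Note that dense_rank() leaves NA scores as NA rather than ranking them.

```r
library(dplyr)

# Two-participant example data from the earlier reply.
dat <- tibble::tribble(
  ~name, ~score,
  "bob", 0, "bob", 5, "bob", 50, "bob", 50, "bob", 60, "bob", NA,
  "sue", 0, "sue", 25, "sue", 50, "sue", 50, "sue", 60, "sue", 25
)

ranked <- dat %>%
  group_by(name) %>%                          # rank within each participant
  mutate(ranked = dense_rank(desc(score))) %>% # highest score gets rank 1, no gaps
  arrange(name, ranked) %>%                   # NA ranks sort to the end
  ungroup()

ranked
```

If the lowest score should be rank 1 instead, drop the desc().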
Excellent, thanks!! Yeah, I can't believe how many damn hours it took me to find this solution. I wish the help documentation on the dplyr ranking verbs were clearer, or I wouldn't have wasted so much time with base R's version.