I have the data set below and I'm trying to add a ranking column. Specifically, I want to rank each player within their team, in the order the players are listed. For example, Tyreek Hill would get a 1 and Travis Kelce would get a 2, since Kelce is the second KC player listed. Likewise, Chris Godwin would get a 1 and Mike Evans would get a 2, while every other player would get a 1 because each is the only player listed for their team. Here's the data:
            player team
1   Michael Thomas   NO
2    Davante Adams   GB
3      Julio Jones  ATL
4      Tyreek Hill   KC
5  DeAndre Hopkins  ARI
6     Travis Kelce   KC
7     Chris Godwin   TB
8    George Kittle   SF
9   Kenny Golladay  DET
10      Mike Evans   TB
11  Allen Robinson  CHI
The dplyr package has the functions group_by, mutate and row_number, which are handy for this: group the rows by team, then number the rows within each group in the order they appear.
library(dplyr, warn.conflicts = FALSE)
DF <- read.csv("~/R/Play/Dummy.csv", stringsAsFactors = FALSE)
DF
#>             player team
#> 1   Michael Thomas   NO
#> 2    Davante Adams   GB
#> 3      Julio Jones  ATL
#> 4      Tyreek Hill   KC
#> 5  DeAndre Hopkins  ARI
#> 6     Travis Kelce   KC
#> 7     Chris Godwin   TB
#> 8    George Kittle   SF
#> 9   Kenny Golladay  DET
#> 10      Mike Evans   TB
#> 11  Allen Robinson  CHI
DF <- DF %>%
  group_by(team) %>%
  mutate(TeamRowNum = row_number())
DF
#> # A tibble: 11 x 3
#> # Groups:   team [9]
#>    player          team  TeamRowNum
#>    <chr>           <chr>      <int>
#>  1 Michael Thomas  NO             1
#>  2 Davante Adams   GB             1
#>  3 Julio Jones     ATL            1
#>  4 Tyreek Hill     KC             1
#>  5 DeAndre Hopkins ARI            1
#>  6 Travis Kelce    KC             2
#>  7 Chris Godwin    TB             1
#>  8 George Kittle   SF             1
#>  9 Kenny Golladay  DET            1
#> 10 Mike Evans      TB             2
#> 11 Allen Robinson  CHI            1
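If you would rather not depend on dplyr, or on a CSV file at a particular path, here is a minimal base R sketch of the same idea. It is an alternative to the dplyr approach above, not part of it: the data frame is rebuilt inline from the table in the question, and ave() with seq_along() stands in for group_by() plus row_number().

```r
# Rebuild the example data inline so the snippet does not need a CSV file.
DF <- data.frame(
  player = c("Michael Thomas", "Davante Adams", "Julio Jones", "Tyreek Hill",
             "DeAndre Hopkins", "Travis Kelce", "Chris Godwin", "George Kittle",
             "Kenny Golladay", "Mike Evans", "Allen Robinson"),
  team = c("NO", "GB", "ATL", "KC", "ARI", "KC", "TB", "SF", "DET", "TB", "CHI"),
  stringsAsFactors = FALSE
)

# ave() applies FUN separately within each team and returns the results
# in the original row order; seq_along() numbers the rows of each group
# 1, 2, ... in the order they appear.
DF$TeamRowNum <- ave(seq_len(nrow(DF)), DF$team, FUN = seq_along)

DF$TeamRowNum
#> [1] 1 1 1 1 1 2 1 1 1 2 1
```

Because the numbering follows row order, sort the data first (for example with order()) if the rank should reflect something other than the order in which the players are listed.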