I would like to replace the category column with the category corresponding to the max value in the sales column.
My data looks as follows:
df <- data.frame(CATEGORY = c("A","A","A","B","B"), SALES = c(10,20,30,40,50))
I'm looking to fill the CATEGORY variable with "B" since the max value in SALES has a CATEGORY of B
df <- data.frame(CATEGORY = c("B","B","B","B","B"), SALES = c(10,20,30,40,50))
If this can be achieved using dplyr syntax I'd be very grateful if anyone could give me a few pointers.
Thanks
DavoWW
2
Hi @jgarrigan,
This should do it:
suppressPackageStartupMessages(library(tidyverse))
df <- data.frame(CATEGORY = c("A","A","A","B","B"), SALES = c(10,20,30,40,50))
df
#> CATEGORY SALES
#> 1 A 10
#> 2 A 20
#> 3 A 30
#> 4 B 40
#> 5 B 50
# Basic approach
df %>%
mutate(max_cat = CATEGORY[SALES == max(SALES)])
#> CATEGORY SALES max_cat
#> 1 A 10 B
#> 2 A 20 B
#> 3 A 30 B
#> 4 B 40 B
#> 5 B 50 B
# Do it, plus tidy-up the output
df %>%
mutate(max_cat = CATEGORY[SALES == max(SALES)]) %>%
select(-1) %>%
rename(CATEGORY = max_cat) %>%
select(2 ,1)
#> CATEGORY SALES
#> 1 B 10
#> 2 B 20
#> 3 B 30
#> 4 B 40
#> 5 B 50
Created on 2022-01-29 by the reprex package (v2.0.1)
1 Like
system
Closed
3
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
If you have a query related to it or one of the replies, start a new topic and refer back with a link.