I would like to replace every value in the CATEGORY column with the category of the row that has the maximum value in the SALES column.

My data looks as follows:

`df <- data.frame(CATEGORY = c("A","A","A","B","B"), SALES = c(10,20,30,40,50))`

I'm looking to fill the CATEGORY variable with "B", since the row with the max value in SALES has a CATEGORY of "B":

`df <- data.frame(CATEGORY = c("B","B","B","B","B"), SALES = c(10,20,30,40,50))`

If this can be achieved using dplyr syntax, I'd be very grateful if anyone could give me a few pointers.

Thanks

DavoWW
Hi @jgarrigan,

This should do it:

```
suppressPackageStartupMessages(library(tidyverse))

df <- data.frame(CATEGORY = c("A","A","A","B","B"), SALES = c(10,20,30,40,50))
df
#>   CATEGORY SALES
#> 1        A    10
#> 2        A    20
#> 3        A    30
#> 4        B    40
#> 5        B    50

# Basic approach
df %>%
  mutate(max_cat = CATEGORY[SALES == max(SALES)])
#>   CATEGORY SALES max_cat
#> 1        A    10       B
#> 2        A    20       B
#> 3        A    30       B
#> 4        B    40       B
#> 5        B    50       B

# Do it, plus tidy-up the output
df %>%
  mutate(max_cat = CATEGORY[SALES == max(SALES)]) %>%
  select(-1) %>%
  rename(CATEGORY = max_cat) %>%
  select(2, 1)
#>   CATEGORY SALES
#> 1        B    10
#> 2        B    20
#> 3        B    30
#> 4        B    40
#> 5        B    50
```

Created on 2022-01-29 by the reprex package (v2.0.1)
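
A shorter variant may also be worth trying. This is only a sketch: it overwrites CATEGORY directly inside `mutate()`, so the `select()`/`rename()` tidy-up is not needed. Using `which.max()` in place of `SALES == max(SALES)` is a substitution on my part, and it assumes that when several rows tie for the largest SALES, the first of them should win. The name `result` is purely illustrative.

```r
library(dplyr)

df <- data.frame(CATEGORY = c("A", "A", "A", "B", "B"),
                 SALES = c(10, 20, 30, 40, 50))

# which.max() returns the index of the first maximum, so the right-hand side
# is always a single value that mutate() recycles across every row.
# Assumption: on ties in SALES, the first matching row decides the category.
result <- df %>%
  mutate(CATEGORY = CATEGORY[which.max(SALES)])

result
```

With the example data, every row of `result` should have CATEGORY "B" while SALES is left unchanged. The `SALES == max(SALES)` form can return more than one value when there are ties, which `mutate()` would likely reject unless the length happens to match the number of rows.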
