Ordering data by the sum of rows and applying group numbers

Hey all,

Thanks in advance for the help. I'm trying to order data by levels of one factor, and then by the sum of numeric data associated with another factor. Secondly, I'm trying to apply a group number. I want to do this without manual input, such that a script can be saved for different, future data. To clarify, I've created an example dataset:

library(dplyr)

Year <- c("2019", "2019", "2019", "2019", "2019", "2019", "2020",
"2020", "2020", "2019", "2019", "2019", "2020", "2020", "2020", "2019", "2019", "2019")
Model <- factor(c("Mustang", "Mustang", "Mustang", "F150", "F150", "F150", "F150", "F150",
"F150", "Cruise", "Cruise", "Cruise", "Camaro", "Camaro", "Camaro", "Cruise",
"Cruise", "Cruise"))
Make <- factor(c("Ford", "Ford", "Ford", "Ford", "Ford", "Ford", "Ford", "Ford", "Ford",
"Chevy", "Chevy", "Chevy", "Chevy", "Chevy", "Chevy", "Toyota", "Toyota", "Toyota"),
levels = c("Ford", "Chevy", "Toyota"))
Color <- factor(c("Red", "Blue", "Green", "Red", "Blue", "Green", "Red", "Blue", "Green",
"Red", "Blue", "Green", "Red", "Blue", "Green", "Red", "Blue", "Green"),
levels = c("Red", "Blue", "Green"))
Purchases <- c(20, 35, 10, 32, 49, 45, 37, 53, 25, 55, 24, 13, 32, 45, 60, 12, 20, 13)

Data <- data.frame(Year, Model, Make, Color, Purchases)

Data.order <- Data %>%
group_by(Make,Model) %>%
summarize(Sum=sum(Purchases)) %>%
arrange(Make,-Sum)

The 1st level of arrangement must be the Make, and then I want to re-order the Models according to the sum of purchases of the 3 different colors. Year is not relevant for ordering. I've attached a screenshot of the end goal.

Progress/issue: I've tried to sum and order the data into another data frame, and then reorder the levels of the primary data frame according to the new levels taken from the sum data frame, before arranging accordingly. However, this approach isn't working because it turns out I have repeated levels in the Models, due to different Makes sharing the same Model name, "Cruise". Perhaps I can apply a consecutive group number sequence within the new Order data frame, and then cross-match the group numbers back to primary data frame to arrange by Type, Group number?

Thanks a lot everyone!!

levels(Data$Model)
(new_levels <- pull(Data.order,Model) %>% as.character())
Data$Model <- factor(Data$Model,levels=new_levels)
Data

Data %>% arrange(Make,Model) %>% mutate(mmgroup = as.integer(forcats::as_factor(paste0(Model,Make))))
1 Like

@nirgrahamuk You are my HERO thanks a lot!

@nirgrahamuk Hey Nir, I've hit an issue. In my actual data set, I have repeated names in the Ordering data frame that I'm trying to create the new levels from. Say for example, that Honda and Cadillac also have a model named "Cruise." So it's not letting me create the new leveled factor with duplicated names in the column. Do you have any advice?

I'm not very clear on the ideal behaviour, but if you simply want to avoid an error of duplicate levels you can make them unique...

(new_levels <- pull(Data.order,Model) %>% as.character() %>% unique)

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.