Hi, I have this simple data frame with a recoded variable:
source <- data.frame(
  stringsAsFactors = FALSE,
  Mileage = c(600, 10, 50, 100, 1200, 500, 1000, 2000, 1500, 300, 750),
  Model = c("aaa","aaa","aaa","aaa",
            "aaa","bbb","bbb","bbb","bbb","bbb","bbb"),
  RoTotal = c(0.632652484490701,
              0.185310835928467,0.577469486919075,0.68712277811571,
              0.886954372144862,0.47168592416002,0.488694716005684,
              0.863730894475365,0.366193956213799,0.695052242862401,
              0.714609567675162)
)
library(dplyr)
source <- source %>% mutate(
  Mileage.Bucket = case_when(
    Mileage >= 0 & Mileage <= 100 ~ 1,
    Mileage > 100 & Mileage <= 500 ~ 2,
    Mileage > 500 & Mileage <= 1000 ~ 3,
    Mileage > 1000 & Mileage <= 1500 ~ 4,
    Mileage > 1500 ~ 5))
source$Mileage.Bucket <- factor(as.numeric(source$Mileage.Bucket),
                                levels = c(1, 2, 3, 4, 5),
                                labels = c("0-100", "100-500", "500-1000", "1000-1500", "More than 1500"))
table(source$Mileage.Bucket)
I deliberately used numbers instead of text in the mutate call so that the buckets keep their logical order (0-100, 100-500, 500-1000, 1000-1500, More than 1500).
Unfortunately, when I build a summary table with dplyr, Mileage.Bucket seems to be treated as text (I think), and as a result the mileage buckets come out in an illogical order.
result.table <- source %>%
  bind_rows(mutate(.data = ., Mileage.Bucket = "Total")) %>%
  group_by(Model, Mileage.Bucket) %>%
  summarise_at(.vars = vars(ends_with(match = "RoTotal")),
               .funs = list(Aver = ~mean(., na.rm = TRUE),
                            Count = ~sum(!is.na(.))))
result.table
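A quick check that seems to support my suspicion: on a small made-up data frame (the name buckets and its values are only for illustration), binding a factor column to rows that hold the plain string "Total" leaves a character column, which would explain the alphabetical sorting.

```r
library(dplyr)

# Toy factor with a deliberate, non-alphabetical level order (illustrative data)
buckets <- data.frame(
  Mileage.Bucket = factor(c("500-1000", "0-100", "100-500"),
                          levels = c("0-100", "100-500", "500-1000"))
)

# Append a copy of the rows where the column is the plain string "Total"
combined <- bind_rows(buckets, mutate(buckets, Mileage.Bucket = "Total"))

class(buckets$Mileage.Bucket)   # factor before binding
class(combined$Mileage.Bucket)  # character after binding
```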
Is there a way of ordering Mileage.Bucket by its underlying values (so 1, 2, 3, 4 and 5) without using a manual arrange command like this one?
arrange(match(Mileage.Bucket, c("0-100", "100-500", "500-1000", "1000-1500", "More than 1500")))
I am out of ideas...
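One direction that might work, sketched on a small made-up data frame (the names toy and bucket_levels and the values are hypothetical, and I am assuming dplyr 1.0 or later for the .groups argument): turn the column back into a factor after bind_rows, with "Total" appended as the last level, so that group_by sorts by level order rather than alphabetically.

```r
library(dplyr)

# Illustrative data only; not the data frame from the question
toy <- data.frame(
  Model = c("aaa", "aaa", "bbb", "bbb"),
  Mileage.Bucket = factor(
    c("500-1000", "0-100", "More than 1500", "100-500"),
    levels = c("0-100", "100-500", "500-1000", "1000-1500", "More than 1500")
  ),
  RoTotal = c(0.63, 0.19, 0.86, 0.47)
)

# Original bucket order, with "Total" appended as the final level
bucket_levels <- c(levels(toy$Mileage.Bucket), "Total")

result <- toy %>%
  bind_rows(mutate(., Mileage.Bucket = "Total")) %>%
  # bind_rows leaves a character column, so restore the factor here
  mutate(Mileage.Bucket = factor(Mileage.Bucket, levels = bucket_levels)) %>%
  group_by(Model, Mileage.Bucket) %>%
  summarise(Aver = mean(RoTotal, na.rm = TRUE),
            Count = sum(!is.na(RoTotal)),
            .groups = "drop")

result
```

Because the grouping column is a factor again, the rows within each Model should follow the level order, with "Total" last, and no arrange(match(...)) call is needed.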