Slavek
October 24, 2019, 2:09pm
1
Hi, andresrcs has kindly helped me to group the elements in "Model" based on their frequency in the data frame. Now I'm trying something a bit more complicated: I would like to apply the same rule, but based on the frequency in 2020 rather than the overall frequency.
Basically, only "bb" and "cc" should stay (prop >= 0.4); all other Models should be coded as "Other".
The code below uses the proportions across the whole data frame rather than the 2020 proportions.
library(dplyr)
library(forcats)
Sales.data.t <- data.frame(stringsAsFactors = FALSE,
                           Year = c(2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2019),
                           Model = c("cc", "aa", "gg", "cc", "bb", "bb", NA,
                                     "cc", "cc", "bb"),
                           RType = c("H", "A", "A", "H", "B", "h", "A", "H",
                                     "H", "B")
)
prop <- with(Sales.data.t, table(Model, Year)) %>%
  prop.table(margin = 2)
prop
Sales.data.t <- Sales.data.t %>%
  mutate(Main.Models = fct_lump(Model, prop = 0.4))
Sales.data.t
Can you help please?
Since you are already calculating the proportions manually, you could take the levels from there and use fct_other(); see this example.
library(dplyr)
library(forcats)
Sales.data.t <- data.frame(stringsAsFactors = FALSE,
                           Year = c(2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2019),
                           Model = c("cc", "aa", "gg", "cc", "bb", "bb", NA,
                                     "cc", "cc", "bb"),
                           RType = c("H", "A", "A", "H", "B", "h", "A", "H",
                                     "H", "B")
)
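# Column-wise proportions (margin = 2), i.e. the share of each Model within each Year
keep_levels <- with(Sales.data.t, table(Model, Year)) %>%
  prop.table(margin = 2) %>%
  # as_tibble() on a table gives the columns Model, Year and n (n holds the proportion here)
  as_tibble() %>%
  # Keep only the Models that reach the 0.4 threshold in 2020
  filter(Year == 2020, n >= 0.4) %>%
  pull(Model)
# Everything that is not in keep_levels is recoded as "Other"
Sales.data.t %>%
  mutate(Main.Models = fct_other(Model, keep = keep_levels))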
#>    Year Model RType Main.Models
#> 1  2019    cc     H          cc
#> 2  2019    aa     A       Other
#> 3  2019    gg     A       Other
#> 4  2020    cc     H          cc
#> 5  2020    bb     B          bb
#> 6  2020    bb     h          bb
#> 7  2020  <NA>     A        <NA>
#> 8  2020    cc     H          cc
#> 9  2020    cc     H          cc
#> 10 2019    bb     B          bb
Created on 2019-10-26 by the reprex package (v0.3.0.9000)
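If you would rather stay in dplyr verbs, the same idea can probably be written without table() and prop.table(). This is only a sketch of the approach above, not a different method: the helper name levels_by_year_share() and its arguments year and min_prop are made up for illustration, and it assumes that rows with a missing Model should be left out of the denominator, which is what table() does by default.

```r
library(dplyr)
library(forcats)

Sales.data.t <- data.frame(
  stringsAsFactors = FALSE,
  Year  = c(2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2019),
  Model = c("cc", "aa", "gg", "cc", "bb", "bb", NA, "cc", "cc", "bb"),
  RType = c("H", "A", "A", "H", "B", "h", "A", "H", "H", "B")
)

# Hypothetical helper (not part of dplyr or forcats): returns the Models whose
# share within the given year is at least min_prop.
# Assumption: rows with a missing Model are dropped before the shares are computed.
levels_by_year_share <- function(data, year, min_prop) {
  data %>%
    filter(Year == year, !is.na(Model)) %>%
    count(Model) %>%                  # one row per Model, with the count in n
    mutate(share = n / sum(n)) %>%    # share of each Model within that year
    filter(share >= min_prop) %>%
    pull(Model)
}

keep_levels <- levels_by_year_share(Sales.data.t, year = 2020, min_prop = 0.4)

# Same recoding step as above: everything outside keep_levels becomes "Other"
result <- Sales.data.t %>%
  mutate(Main.Models = fct_other(Model, keep = keep_levels))
result
```

Wrapping the calculation in a function makes it easier to change the reference year or the threshold later. Note that "bb" sits exactly on the 0.4 threshold in 2020, so it is kept only because the comparison is >= and not >.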
Slavek
October 28, 2019, 10:24am
3
andresrcs:
keep_levels <- with(Sales.data.t, table(Model, Year)) %>%
  prop.table(margin = 2) %>%
  as_tibble() %>%
  filter(Year == 2020, n >= 0.4) %>%
  pull(Model)
Sales.data.t %>%
  mutate(Main.Models = fct_other(Model, keep = keep_levels))
Very clever, thank you!!!