Forcats and fct_lump question

I have this dataframe:

df <- data.frame(
  stringsAsFactors = FALSE,
              name = c("PA2","PA1","PA4","PA5",
                       "PA6","PA7","PA8","PA3","PA9","PA10","PA11"),
            number = c(10417L,7436L,2552L,2660L,
                       2308L,2988L,2374L,198L,243L,143L,773L),
             elect = c(9L, 6L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 0L)
)

I want to collapse all the names with 0 "elect" into a single new row labelled "Others", e.g. Others 1357 0 (1357 is the sum of "number" over the rows where elect is 0), and keep the rest of df unchanged:

name number elect
PA2 10417 9
PA1 7436 6
PA4 2552 2
PA5 2660 2
PA6 2308 2
PA7 2988 2
PA8 2374 2
Others 1357 0

I'm trying to do it with forcats fct_lump, but I can't get it to work.

Unlike data.frame() before R 4.0.0, tibble() never coerces character columns to factors, so stringsAsFactors = FALSE is not needed:

library(tidyverse)

df <- tibble(
  name = c("PA2","PA1","PA4","PA5",
           "PA6","PA7","PA8","PA3","PA9","PA10","PA11"),
  number = c(10417L,7436L,2552L,2660L,
             2308L,2988L,2374L,198L,243L,143L,773L),
  elect = c(9L, 6L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 0L)
)
df |> 
  mutate(name = if_else(elect == 0, "Other", name)) |> 
  group_by(name) |> 
  mutate(number = sum(number)) |>
  ungroup() |> 
  distinct()
#> # A tibble: 8 × 3
#>   name  number elect
#>   <chr>  <int> <int>
#> 1 PA2    10417     9
#> 2 PA1     7436     6
#> 3 PA4     2552     2
#> 4 PA5     2660     2
#> 5 PA6     2308     2
#> 6 PA7     2988     2
#> 7 PA8     2374     2
#> 8 Other   1357     0

## coming in {dplyr} 1.1.0: use .by instead of group_by() and then ungroup()
# df |> 
#   mutate(name = if_else(elect == 0, "Other", name)) |> 
#   mutate(number = sum(number), .by = name) |> 
#   distinct()

Created on 2023-01-19 with reprex v2.0.2
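
Since the question asks about forcats specifically, here is a minimal sketch of one way the fct_lump family could be used. It assumes fct_lump_min() with elect passed as the weight `w`, so that any name whose total weight falls below 1 is lumped; the `other_level = "Others"` label and the `lumped` name are choices made for this illustration, not part of the answer above. It also uses summarise() rather than mutate() plus distinct(), which does not rely on the lumped rows sharing the same elect value.

```r
library(dplyr)
library(forcats)

df <- tibble(
  name = c("PA2", "PA1", "PA4", "PA5",
           "PA6", "PA7", "PA8", "PA3", "PA9", "PA10", "PA11"),
  number = c(10417L, 7436L, 2552L, 2660L,
             2308L, 2988L, 2374L, 198L, 243L, 143L, 773L),
  elect = c(9L, 6L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 0L)
)

lumped <- df |>
  # Weight each name by elect; names whose summed weight is below 1
  # (here: elect == 0) are collapsed into the "Others" level.
  mutate(name = fct_lump_min(name, min = 1, w = elect, other_level = "Others")) |>
  group_by(name) |>
  # One row per level, with number and elect totalled within each level.
  summarise(number = sum(number), elect = sum(elect)) |>
  ungroup()

lumped
```

Note that name becomes a factor here, so the rows come back in factor-level order (with "Others" last) rather than in the original row order.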

In the original dataframe, number and elect are numbers. When I apply this solution to my dataframe, the new name column comes out messy, but that is not a serious problem, since I can always reorder it. Thanks, EconProf.
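
For the reordering mentioned above, a possible sketch is to sort explicitly after collapsing. The ordering rule here (the lumped row last, everything else by elect and then number, descending) is only an assumption about what a tidy order might look like, and the "Others" label and the `result` name are illustrative.

```r
library(dplyr)

df <- tibble(
  name = c("PA2", "PA1", "PA4", "PA5",
           "PA6", "PA7", "PA8", "PA3", "PA9", "PA10", "PA11"),
  number = c(10417L, 7436L, 2552L, 2660L,
             2308L, 2988L, 2374L, 198L, 243L, 143L, 773L),
  elect = c(9L, 6L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 0L)
)

result <- df |>
  mutate(name = if_else(elect == 0, "Others", name)) |>
  group_by(name) |>
  summarise(number = sum(number), elect = sum(elect)) |>
  ungroup() |>
  # FALSE sorts before TRUE, so the "Others" row is pushed to the end;
  # the remaining rows are sorted by elect, then number, both descending.
  arrange(name == "Others", desc(elect), desc(number))

result
```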
