Hi, I am trying to recode a race/ethnicity variable so that, for a given country, it has 3 categories: Majority, Minority, and NA for the rest. I started with ifelse and mutate, but I had trouble coming up with a conditional statement that identifies the most numerous racial group per country. I appreciate any help.
The output of dput(head(df, 20)) would have been more convenient than the output of glimpse, since it can be pasted straight into R. I constructed a toy data set for an example that I hope will give you what you need. I first calculate the ethnic group that appears most commonly in each country. Then, for each row of the original data, I append a column showing which ethnic group is most common for that row's country. If the row's own ethnic group matches the most common one, I label that row Dominant. Otherwise the label is Minority.
Note that if two ethnic groups have exactly the same number of members in a country, slice_max keeps both of them (it keeps ties by default), so the left_join will match every row of that country twice and you will have to decide how to deal with that. To see whether this happened, compare the number of rows in your data frame before and after running the code. You will get extra rows if two ethnic groups were tied for the largest.
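Here is a minimal sketch of how a tie could be detected and, if you want, broken. The data frame Tied and the names Counts, WithTies, NoTies and TiedCountries are made up for illustration and are not part of the example in this post. Note that with_ties = FALSE just keeps one of the tied groups, which is an arbitrary choice and may not be what you want for your analysis.

```r
library(dplyr)

# Hypothetical toy data: in country "C", groups "Q" and "W" are tied
Tied <- data.frame(Country     = c("C", "C", "C", "C"),
                   EthnicGroup = c("Q", "W", "Q", "W"))

# Number of rows per country and ethnic group
Counts <- Tied |> count(Country, EthnicGroup, name = "N")

# slice_max() keeps ties by default, so country C contributes two rows
WithTies <- Counts |>
  group_by(Country) |>
  slice_max(order_by = N) |>
  ungroup()

# with_ties = FALSE keeps exactly one row per country
NoTies <- Counts |>
  group_by(Country) |>
  slice_max(order_by = N, with_ties = FALSE) |>
  ungroup()

# Countries that have more than one "largest" group
TiedCountries <- WithTies |>
  count(Country, name = "TopGroups") |>
  filter(TopGroups > 1)
```

If TiedCountries has zero rows for your real data, the left_join will not add any rows.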
I included several steps where I print out the intermediate data frames to make it clearer how the code works. Those steps are not necessary.
library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.1.2
DF <- data.frame(Country = c("A", "A", "A", "A",
                             "B", "B", "B", "B"),
                 EthnicGroup = c("Q", "W", "Q", "Q",
                                 "Q", "E", "E", "E"))
DF
#>   Country EthnicGroup
#> 1       A           Q
#> 2       A           W
#> 3       A           Q
#> 4       A           Q
#> 5       B           Q
#> 6       B           E
#> 7       B           E
#> 8       B           E
MaxEthnic <- DF |> group_by(Country, EthnicGroup) |>
  summarize(N = n()) |>
  slice_max(order_by = N)
#> `summarise()` has grouped output by 'Country'. You can override using the `.groups` argument.
MaxEthnic
#> # A tibble: 2 x 3
#> # Groups:   Country [2]
#>   Country EthnicGroup     N
#>   <chr>   <chr>       <int>
#> 1 A       Q               3
#> 2 B       E               3
DF <- left_join(DF, MaxEthnic, by = "Country")
DF
#>   Country EthnicGroup.x EthnicGroup.y N
#> 1       A             Q             Q 3
#> 2       A             W             Q 3
#> 3       A             Q             Q 3
#> 4       A             Q             Q 3
#> 5       B             Q             E 3
#> 6       B             E             E 3
#> 7       B             E             E 3
#> 8       B             E             E 3
DF <- DF |> mutate(Dom_Minor = ifelse(EthnicGroup.x == EthnicGroup.y, "Dominant", "Minority"))
DF
#>   Country EthnicGroup.x EthnicGroup.y N Dom_Minor
#> 1       A             Q             Q 3  Dominant
#> 2       A             W             Q 3  Minority
#> 3       A             Q             Q 3  Dominant
#> 4       A             Q             Q 3  Dominant
#> 5       B             Q             E 3  Minority
#> 6       B             E             E 3  Dominant
#> 7       B             E             E 3  Dominant
#> 8       B             E             E 3  Dominant
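If you would rather avoid the join, the same labelling can probably be done in one pipeline. This is only a sketch: DF2 and the column TopN are names I made up, and I am assuming that "the rest NA" means rows with a missing ethnic group should stay NA. Because nothing is joined, a tie does not add rows here. Instead, every group tied for the largest gets the Dominant label.

```r
library(dplyr)

# Hypothetical toy data with one missing ethnic group in country A
DF2 <- data.frame(Country     = c("A", "A", "A", "A", "B", "B", "B", "B"),
                  EthnicGroup = c("Q", "W", "Q", NA,  "Q", "E", "E", "E"))

DF2 <- DF2 |>
  # N = number of rows sharing this row's country and ethnic group
  add_count(Country, EthnicGroup, name = "N") |>
  group_by(Country) |>
  # TopN = size of the largest non-missing group in the country
  mutate(TopN = max(N[!is.na(EthnicGroup)]),
         Dom_Minor = case_when(is.na(EthnicGroup) ~ NA_character_,
                               N == TopN          ~ "Dominant",
                               TRUE               ~ "Minority")) |>
  ungroup()

DF2
```

A country where every EthnicGroup value is missing would need separate handling, because max() of an empty vector returns -Inf with a warning.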