Calculate percentages from a variable

David_Suman · February 25, 2022, 10:02pm

Hi everyone, I'm brand new to R but am trying to use tidycensus.
I'm trying to find a way to calculate each variable as a percentage of another variable in the table.
Is there any way to do this?

library(tidycensus)

dec_race_var <- get_decennial(
  geography = "county",
  state = "TX",
  county = "Harris County",
  year = 2020,
  variables = c(
    "P2_001N",
    "P2_005N",
    "P2_006N",
    "P2_002N",
    "P2_008N",
    "P2_007N",
    "P2_009N",
    "P2_010N",
    "P2_011N")
)
head(dec_race_var)

This is my current code. It gives me the variables I need, but I want to express each of the other variables as a percentage of the P2_001N variable. I tried adding another column containing only the P2_001N value and then running a row-wise calculation, but had no luck creating a column filled with the P2_001N value.

This is what I'm aiming to create.

Thanks for any help!

Kyle_Walker · February 26, 2022, 1:31pm

You can use the summary_var argument in tidycensus to do this (read more about it here):

This will create a new denominator column based on a variable you've selected. Calculating percentages in a long-form (tidy) dataset is then straightforward with dplyr::mutate().

library(tidycensus)
library(tidyverse)

dec_race_var <- get_decennial(
  geography = "county",
  state = "TX",
  county = "Harris County",
  year = 2020,
  variables = c(
    White = "P2_005N",
    Black = "P2_006N",
    Hispanic = "P2_002N",
    Asian = "P2_008N",
    `American Indian / Alaska Native` = "P2_007N",
    `Native Hawaiian / Pacific Islander` = "P2_009N",
    `Other race` = "P2_010N",
    `Two or more races` = "P2_011N"),
  summary_var = "P2_001N"
) %>%
  mutate(percent = 100 * (value / summary_value))
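
For anyone who wants to see the idea without calling the Census API, here is a minimal sketch of the same calculation on a small hand-built table. It assumes long-form data with GEOID, NAME, variable, and value columns, and the county name and counts are made up purely for illustration. It copies the P2_001N total onto every row by hand, which is roughly the column the original question was trying to add; it is not a claim about how tidycensus builds summary_value internally.

```r
library(dplyr)

# Hypothetical long-form data with GEOID, NAME, variable, value columns.
# The county and the counts are invented for illustration only.
toy <- tibble(
  GEOID    = "00001",
  NAME     = "Example County",
  variable = c("P2_001N", "P2_002N", "P2_005N", "P2_006N"),
  value    = c(1000, 400, 350, 250)
)

toy_pct <- toy %>%
  group_by(GEOID) %>%
  # Copy the total (P2_001N) onto every row of the same geography
  mutate(summary_value = value[variable == "P2_001N"]) %>%
  ungroup() %>%
  # Drop the total row itself, then compute each variable's share
  filter(variable != "P2_001N") %>%
  mutate(percent = 100 * (value / summary_value))

print(toy_pct)
```

Grouping by GEOID keeps the denominator matched to the right geography if the table holds more than one county. With real data, the summary_var argument shown above avoids this manual step.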
