Multiple variables in a custom value?

David_Suman · March 28, 2022, 2:04pm

Hi everyone,

I'm currently working on code that uses the tidycensus package. I'm trying to pull data by block group to compute statistics for target areas made up of several block groups. Here is a snippet of what I currently have.

library(tidycensus)
library(dplyr)

report_year <- 2020
report_geography <- "block group"
report_geoid <- "482015523031"  # quoted: get_acs() returns GEOID as a character column
report_state <- "TX"
report_survey <- "acs5"
export_name <- "Blockgroup_2020_Snapshot.csv"

acs_lan_1 <- get_acs(
  geography = report_geography,
  state = report_state,
  year = report_year,
  survey = report_survey,
  summary_var = "C16001_001",
  variables = c(
    "C16001_002",
    "C16001_003",
    "C16001_012",
    "C16001_021",
    "C16001_027",
    "C16001_033",
    "C16001_018")) %>%
  filter(GEOID == report_geoid)

# Keep the estimate as `value` and express it as a share of the table total
acs_lan_1b <- acs_lan_1 %>%
  mutate(value = estimate, percent = value / summary_est) %>%
  select(variable, value, percent)

acs_lan_1b[acs_lan_1b == "C16001_002"] <- "English Only"
acs_lan_1b[acs_lan_1b == "C16001_003"] <- "Spanish"
acs_lan_1b[acs_lan_1b == "C16001_012"] <- "Slavic Languages"
acs_lan_1b[acs_lan_1b == "C16001_021"] <- "Chinese"
acs_lan_1b[acs_lan_1b == "C16001_027"] <- "Tagalog"
acs_lan_1b[acs_lan_1b == "C16001_033"] <- "Arabic"
acs_lan_1b[acs_lan_1b == "C16001_018"] <- "Korean"

I am very new to R, so I'm sure this code isn't as efficient as it could be. Is there a way to store multiple GEOIDs in report_geoid and use that in my filter, so I won't have to rerun the code for each block group in a target area (some target areas include 20+ block groups)?
The code I wrote is very long, so I would prefer not to repeat it for each block group, but I might not have a choice.

Thanks for any help!

dvetsch75 · March 28, 2022, 4:30pm

I would use %in% in your filter condition. Then when you set up report_geoid you can set it up as a vector instead of just a single value:

report_geoid <- c("482015523031", "123456789123") # The second GEOID is made up; quoted because GEOID is a character column

Then, in your filter I would do this:

filter(GEOID %in% report_geoid)

Hope that helps.
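
To make the suggestion above concrete, here is a minimal sketch of the %in% filter on a toy data frame. The object names acs_toy, target_geoids, and target_rows are made up for illustration, and the toy rows only stand in for what get_acs() would return. The IDs are kept as strings because numeric IDs would lose any leading zero and very large numbers can be converted to scientific notation when compared against text.

```r
library(dplyr)

# Toy stand-in for get_acs() output (illustrative values, not real ACS data).
acs_toy <- tibble(
  GEOID    = c("482015523031", "482015523031", "482015523032", "482015523033"),
  variable = c("C16001_002", "C16001_003", "C16001_002", "C16001_002"),
  estimate = c(100, 40, 80, 60)
)

# Hypothetical target area made of two block groups.
target_geoids <- c("482015523031", "482015523033")

# Keep every row whose GEOID appears anywhere in the vector.
target_rows <- acs_toy %>%
  filter(GEOID %in% target_geoids)
```

With == the filter compares against a single value, while %in% checks membership in the whole vector, so one call covers every block group in the target area.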

David_Suman · March 28, 2022, 6:16pm

Thank you, works perfectly!
This may warrant another post, but do you know of a way to combine rows in a data frame that share the same variable?

[Example screenshot not preserved]

Is there any way to combine the rows that share a variable ID into one row?
I know I could pull each one individually and sum the column, but I would like to avoid writing the code and calling the ACS data separately for every section.

dvetsch75 · March 28, 2022, 7:08pm

If I'm understanding you right, you want dplyr::group_by and dplyr::summarize. So if you wanted to combine all of the numeric columns into one row per GEOID, you might do something like:

df %>%
    group_by(GEOID) %>%
    summarize(
        across( # Use across to perform an operation on many columns at once
            estimate:summary_moe,
            sum
        )
    )
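
If the goal is instead one row per variable ID across all block groups in a target area, the same pattern can be grouped by variable rather than GEOID. The following is only a sketch on made-up numbers; acs_toy and area_totals are hypothetical names, and it assumes each block group contributes exactly one row per variable. Margins of error are left out on purpose, since they generally should not be added directly (tidycensus provides moe_sum() for that case).

```r
library(dplyr)

# Toy stand-in for filtered get_acs() output covering two block groups
# (illustrative values, not real ACS data).
acs_toy <- tibble(
  GEOID       = c("482015523031", "482015523031", "482015523032", "482015523032"),
  variable    = c("C16001_002", "C16001_003", "C16001_002", "C16001_003"),
  estimate    = c(100, 40, 80, 20),
  summary_est = c(200, 200, 100, 100)
)

# Collapse to one row per variable, then recompute the share of the area total.
area_totals <- acs_toy %>%
  group_by(variable) %>%
  summarize(
    value       = sum(estimate),
    summary_est = sum(summary_est),
    .groups     = "drop"
  ) %>%
  mutate(percent = value / summary_est)
```

Computing percent after summing keeps the share weighted by each block group's population, which averaging the per-block-group percentages would not do.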
