For a demographic analysis of a state, I want to compare race and ethnicity and find out how many people are Hispanic, non-Hispanic, and other.

# for() itself returns NULL, so create the column first and fill it inside the loop
hsg.ppl2$XYZ <- NA_character_
for (i in 1:100) {
  hsg.ppl2$XYZ[i] <-
    if (hsg.ppl2$HISP[i] == 1 & hsg.ppl2$RAC1P[i] == 1) { "NHWht" } else
    if (hsg.ppl2$HISP[i] == 1 & hsg.ppl2$RAC1P[i] == 2) { "NHBlack" } else
    if (hsg.ppl2$HISP[i] == 1 & hsg.ppl2$RAC1P[i] %in% c(6, 7)) { "NHAsian" } else
    if (hsg.ppl2$HISP[i] == 1 & hsg.ppl2$RAC1P[i] %in% c(3, 4, 5, 8, 9)) { "NHOther" } else
    if (hsg.ppl2$HISP[i] > 1) { "HISP" } else { NA_character_ }
}

Hi priya53, I'd ask this kind of question with a minimal reprex. (FAQ: What's a reproducible example (`reprex`) and how do I do one?)

For the "find out how many people" question, check out dplyr's group_by() and tally() functions: https://dplyr.tidyverse.org/reference/tally.html

For example,

library(dplyr)

# tally() is short-hand for summarise()
mtcars %>% tally()
#>    n
#> 1 32
mtcars %>% group_by(cyl) %>% tally()
#> # A tibble: 3 x 2
#>     cyl     n
#>   <dbl> <int>
#> 1     4    11
#> 2     6     7
#> 3     8    14
# count() is a short-hand for group_by() + tally()

mtcars %>% group_by(cyl) %>% tally() gives you how many observations mtcars has for each unique value of cyl.
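
Applied to the recode in the question, one way to build the category column without a loop is dplyr's case_when(), followed by count(). This is only a minimal sketch: the toy data frame stands in for hsg.ppl2, the column name race_eth is made up for illustration, and it assumes (as the question's code does) that HISP == 1 means not Hispanic.

```r
library(dplyr)

# Toy stand-in for hsg.ppl2; the values are invented for illustration
toy <- data.frame(
  HISP  = c(1, 1, 1, 1, 2, 24, 1),
  RAC1P = c(1, 2, 6, 3, 1, 8, 7)
)

# case_when() checks the conditions in order and is vectorized over rows
toy <- toy %>%
  mutate(race_eth = case_when(
    HISP == 1 & RAC1P == 1                  ~ "NHWht",
    HISP == 1 & RAC1P == 2                  ~ "NHBlack",
    HISP == 1 & RAC1P %in% c(6, 7)          ~ "NHAsian",
    HISP == 1 & RAC1P %in% c(3, 4, 5, 8, 9) ~ "NHOther",
    HISP > 1                                ~ "HISP",
    TRUE                                    ~ NA_character_
  ))

# one row per category, with the number of people in column n
counts <- toy %>% count(race_eth)
counts
```

Because case_when() works on whole columns, there is no need to index rows one at a time, and rows with a missing HISP fall through to NA instead of raising an error.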

There are two packages, tidycensus and acs, that provide access to the US decennial census and the annual estimates. The acs documentation explains how to get a Census API key and gives some guidance on forming queries.

With tidycensus, it's going to look like

library(tidycensus)
library(dplyr)

# obtain state population estimates with geographic data
# (get_estimates() queries the Census Population Estimates API rather than the ACS)
states_pop <- get_estimates("state", product = "population", geometry = TRUE, shift_geo = TRUE) %>%
  filter(variable == "POP") %>%
  filter(GEOID != 11) %>%                # drop the District of Columbia
  mutate(geoid = as.integer(GEOID)) %>%
  select(GEOID, NAME, geoid, value) %>%
  mutate(POP_TOT = value) %>%
  select(-value)

# obtain 50-state ACS 2017 population estimates for white, non-Hispanic
# requires manual selection on the ACS site to limit the estimate to that ethnicity
states_ethnic <- get_acs(geography = "state", variables = "B01001H_001") %>%
  filter(GEOID != 11 & GEOID != 72) %>%  # drop the District of Columbia and Puerto Rico
  select(GEOID, NAME, estimate) %>%
  mutate(WHITE = estimate) %>%
  select(-estimate)
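
Once both tables exist, they can be joined on GEOID to put the group estimate next to the total. The sketch below uses small hypothetical stand-ins for states_pop and states_ethnic so it runs without an API key; the numbers are invented and the PCT_WHITE_NH column name is an assumption, not something from the packages.

```r
library(dplyr)

# Hypothetical stand-ins for the two query results; numbers are invented
states_pop <- data.frame(
  GEOID = c("01", "02"), NAME = c("Alabama", "Alaska"),
  POP_TOT = c(1000, 500), stringsAsFactors = FALSE
)
states_ethnic <- data.frame(
  GEOID = c("01", "02"), NAME = c("Alabama", "Alaska"),
  WHITE = c(650, 300), stringsAsFactors = FALSE
)

# join on the shared identifiers, then compute the share of the total
states_share <- states_pop %>%
  left_join(states_ethnic, by = c("GEOID", "NAME")) %>%
  mutate(PCT_WHITE_NH = 100 * WHITE / POP_TOT)
states_share
```

With the real query results, keep in mind that the total and the group estimate come from different Census products, so the ratio is only approximate.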
