I would like to know if there is a way to get the mean of each state's pop.size without having to write out each mean by creating a bunch of new data.frames.
It is grouping the numbers together in the post, but they are separate. It is not all one number.
To find the mean population of all of the counties in each state, use the
data %>%
  group_by(group) %>%
  summarize(mean_ctypop = mean(county_pop))
construction, as in the toy example below with only 2 states and 5 counties:
library(tidyverse) #load library
# now build vectors for the toy example
cty <- c("Autauga", "Baldwin", "Baldwin", "Washtenaw", "Wayne")
state <- c(rep("Alabama", 3), rep("Michigan", 2))
pop <- c(48612, 162586, 28414, 1753893, 370963)
# now assemble into a data frame from vectors
df <- data.frame(cty, state, pop)
df # show the dataframe df
# now run the pipe
df %>% # take the data frame, then
  group_by(state) %>% # group by state, then
  summarize(mean_cty = mean(pop)) # summarize with a new variable, mean_cty
#> # A tibble: 2 x 2
#>   state    mean_cty
#>   <fct>       <dbl>
#> 1 Alabama    79871.
#> 2 Michigan 1062428
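If you want to double-check the result without loading any packages, base R can do the same grouping. This is only a minimal sketch on the same toy data; the name means_base is made up for illustration, and aggregate() is one of several base options (tapply() is another).

```r
# Rebuild the toy data from the example above
df <- data.frame(
  cty   = c("Autauga", "Baldwin", "Baldwin", "Washtenaw", "Wayne"),
  state = c(rep("Alabama", 3), rep("Michigan", 2)),
  pop   = c(48612, 162586, 28414, 1753893, 370963)
)

# Mean of pop within each state, returned as a data frame
# (means_base is a hypothetical name used only in this sketch)
means_base <- aggregate(pop ~ state, data = df, FUN = mean)
means_base
```

The formula pop ~ state reads as "pop grouped by state", so no per-state data frames are needed here either.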
You will find that this construction of
data %>%
  group_by(group) %>%
  summarize(make a new variable)
is useful whenever you need a summary for each group.
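As a rough illustration of how far the pattern stretches, the same pipe can make several new variables in one summarize() call. This sketch reuses the toy data; the column names n_cty and max_cty and the object name state_summary are assumptions chosen for the example.

```r
library(dplyr)

# Same toy data as in the example above
df <- data.frame(
  cty   = c("Autauga", "Baldwin", "Baldwin", "Washtenaw", "Wayne"),
  state = c(rep("Alabama", 3), rep("Michigan", 2)),
  pop   = c(48612, 162586, 28414, 1753893, 370963)
)

# One row per state, with several summaries computed together
# (n_cty, max_cty and state_summary are hypothetical names)
state_summary <- df %>%
  group_by(state) %>%
  summarize(
    n_cty    = n(),        # number of county rows in the state
    mean_cty = mean(pop),  # mean county population
    max_cty  = max(pop)    # largest county population
  )
state_summary
```

Each argument to summarize() becomes one column in the result, so adding another statistic is just one more line.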
That formatted code does look a lot better. To format the code in my response, I used the reprex package, which is very helpful for making reproducible examples for this kind of posting. Reprex takes code that you have copied and formats it nicely to your clipboard for later pasting to a message board like this one.
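For anyone who has not tried it, here is a minimal sketch of a reprex call. It assumes the reprex package is installed along with its rendering dependencies (rmarkdown and pandoc); the tiny expression and the name out are placeholders for illustration.

```r
library(reprex)

# Render a small expression as a reprex using the default Markdown venue.
# The rendered lines are returned invisibly; in an interactive session
# they are also placed on the clipboard, ready to paste.
# (out is a hypothetical name used only in this sketch)
out <- reprex({
  x <- c(1, 2, 3)
  mean(x)
})
```

In everyday use you can instead copy your code to the clipboard and run reprex() with no arguments, then paste the result into your post.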