I have 3 Excel workbooks, each containing population data for a different year. Each datasheet has location codes in the first column and population data in the subsequent columns. I want to look up a specific code in each datasheet and create a table of the total population for each year. Here is an example of the datasheet:
All 3 datasheets look the same. I have defined them as this:
pop_2016 <- read_excel("Wigan LSOA - Mid year 2016.xlsx", sheet="Persons")
pop_2017 <- read_excel("Wigan LSOA - Mid year 2017.xlsx", sheet="Persons")
pop_2018 <- read_excel("Wigan LSOA - Mid year 2018.xlsx", sheet="Persons")
So for each of pop_2016, pop_2017, pop_2018 I want to look in the column "2011 super" at a specific code (E01006283) and the total population associated with it in the "All ages" column.
This is my first time writing a script so I'm a total beginner.
So I have put in this as suggested:
rbind(pop_2016, pop_2017, pop_2018) %>% names()
select(2011_super_output_area_-lower_layer = 1, All_ages = 3) %>%
filter(2011_super_output_area-_lower_layer == "E01006283")
I'm now getting these errors:
rbind(pop_2016, pop_2017, pop_2018) %>% names()
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
select(2011_super_output_area_-lower_layer = 1, All_ages = 3) %>%
Error: unexpected input in " select(2011"
filter(2011_super_output_area_-lower_layer == "E01006283")
Error: unexpected input in " filter(2011"
I understand that you are reading in data whose column/variable names are whatever they happen to be in the source files.
For your own convenience, though, wouldn't you prefer to rename the variables on their way in, so that writing your code and doing your analysis is easier? rename() is a function available in dplyr (part of the tidyverse), and, as has been mentioned, names() can be used both to get variable names and to set them.
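As a rough sketch of that idea, the code below picks the same two columns from each sheet, gives them short names, stacks the sheets and then filters on the code. The small tibbles stand in for the real read_excel() results: their column names, the column positions (1 and 3, taken from the earlier select() attempt) and all the numbers are assumptions for illustration only. Selecting the same columns from every sheet before stacking also avoids the "numbers of columns of arguments do not match" error, which rbind() raises when the data frames do not all have the same number of columns.

```r
library(dplyr)

# Hypothetical stand-in for one sheet; the real data would come from read_excel().
# Column names and population figures are invented for this example.
make_sheet <- function(pops) {
  tibble(
    `2011 super output area - lower layer` = c("E01006283", "E01006284"),
    `Area name` = c("Area A", "Area B"),
    `All ages` = pops
  )
}
pop_2016 <- make_sheet(c(1500, 1600))
pop_2017 <- make_sheet(c(1510, 1610))
pop_2018 <- make_sheet(c(1520, 1620))

# Keep only the code and total columns (by position), give them short
# names, and record which year the sheet belongs to.
tidy_sheet <- function(df, year) {
  df %>%
    select(lsoa_code = 1, all_ages = 3) %>%
    mutate(year = year)
}

# Stack the three tidy sheets, then keep the rows for the code of interest.
totals <- bind_rows(
  tidy_sheet(pop_2016, 2016),
  tidy_sheet(pop_2017, 2017),
  tidy_sheet(pop_2018, 2018)
) %>%
  filter(lsoa_code == "E01006283") %>%
  select(year, all_ages)

print(totals)
```

If the real sheets keep the total in a different position, the `3` in select() would need adjusting, or the column could be picked by its quoted name instead.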
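R does support variable names with spaces: you use the backtick symbol to quote the start and end of such a name, but practically speaking it adds an extra layer of awkwardness. The same goes for names that start with a digit or contain a hyphen, which is why an unquoted name beginning with 2011 gives "unexpected input". It's a case of everything being possible, but some things are more practical than others.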
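A small base R illustration of the difference, using a made-up one-row data frame (the value 1500 is invented):

```r
# check.names = FALSE stops data.frame() from turning "All ages" into "All.ages".
pop <- data.frame(`All ages` = 1500, check.names = FALSE)

# With a space in the name, backticks (or a quoted string) are required.
pop$`All ages`
pop[["All ages"]]

# names() can set names as well as get them; after renaming, no quoting is needed.
names(pop)[names(pop) == "All ages"] <- "all_ages"
pop$all_ages
```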
Often in my own code, I will use simple, short, descriptive names, maybe with underscores to separate the words.
Lots of output functions, such as charting ones, let you define a name or label separately from the variable that the chart element is built from, so that could be the time to reintroduce the long, pretty name with spaces. In some cases, though, I'd switch to the rename-with-backticks approach. It depends on what you are dealing with.
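For example, a base graphics bar chart can take short variable names as input and the long labels as separate arguments. The data frame here is hypothetical and its figures are invented:

```r
# Short, code-friendly names in the data (made-up totals for one area).
totals <- data.frame(
  year = c(2016, 2017, 2018),
  all_ages = c(1500, 1510, 1520)
)

# The long, readable labels are supplied only at plotting time.
barplot(
  totals$all_ages,
  names.arg = totals$year,
  xlab = "Mid-year estimate",
  ylab = "Total population (all ages)",
  main = "E01006283"
)
```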