Hi all,
I have a few lines of code that drop a column, rename a column, and merge two data frames. I need to repeat this 18 times. I have been trying to write a function to do it, but merge is not happy with the renamed column; it does not see it for some reason. I have actually written the code out 18 times for the individual trials and everything works, but inside the function I created it does not.
Any help is appreciated!
This works:
enrollment2001 <- read.csv("C:\\Box Sync\\Name\\Data\\Schools\\enrollment\\cde_enrollment\\Batch579\\enrollment_2000-2001.csv", colClasses = c("CDS_CODE" = "character"))
frpm2001 <- read.csv("C:\\Box Sync\\Name\\Data\\Schools\\frpm\\frpm_2000-2018_clean1\\frpm2001_clean1.csv", colClasses = c("CDSCode" = "character", "year"= "NULL"))
#rename key column to be a shared name CDS_CODE
names(frpm2001)[names(frpm2001) == "CDSCode"] <- "CDS_CODE"
#merge enrollment with frpm
Merged01 <- merge(enrollment2001, frpm2001, by= "CDS_CODE", all.x= TRUE)
This function, meant to combine enrollment and frpm data, does not work:
enrol_frpm <- function(year){
  #read in the data
  inputPath1 <- "C:\\Box Sync\\Name\\Data\\Schools\\enrollment\\cde_enrollment\\Batch579\\"
  inputPath2 <- "C:\\Box Sync\\Name\\Data\\Schools\\frpm\\frpm_2000-2018_clean1\\"
  inputPath3 <- "C:\\Box Sync\\Name\\Data\\Schools\\enrol_frpm\\"
  enrol <- read.table(paste(inputPath1,"enrollment_",year-1,"-",year,".csv",sep=''),sep = '', fill = TRUE,header = TRUE, quote = "", colClasses = c("CDS_CODE" = "character"))
  frpm <- read.csv(paste(inputPath2,"frpm",year, "_clean1", ".csv",sep = ''),sep = '', fill = TRUE,header = TRUE, quote = "", colClasses = c("CDSCode" = "character"))
  #drop year column in frpm
  frpm$year= NULL
  #rename key column to be a shared name CDS_CODE
  names(frpm) <- c("CDS_CODE")
  #merge enrollment with frpm
  Merged<- merge(enrol, frpm, by= `CDS_CODE`, all.x= TRUE)
  #write out the data
  write.csv(Merged, paste0(inputPath3,"enrol_frpm",year,".csv",sep=''), row.names = FALSE)
}
#define the years of interest
years <- c(2001:2018)
#run the previously defined function for each year
for (year in years) {
  enrol_frpm(year)
}
Running the loop gives:
Error in merge(enrol, frpm, by = CDS_CODE, all.x = TRUE) :
  object 'CDS_CODE' not found
In addition: Warning messages:
1: In read.table(paste(inputPath1, "enrollment_", year - 1, "-", year, :
  not all columns named in 'colClasses' exist
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
  not all columns named in 'colClasses' exist
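
For comparison, here is a minimal sketch of how the function might look if it followed the version that works. Two things in the function above differ from that version and match the messages: the backticks in by= `CDS_CODE` make R look for an object called CDS_CODE instead of using the column name as a string, and sep = '' tells read.table and read.csv to split on whitespace rather than commas, which would explain the colClasses warnings. The sketch is not tested against the real files. It writes two toy CSV files to a temporary folder so it runs anywhere, and the column names ENR_TOTAL and frpm_count, the sample codes, and the folder arguments enrol_dir, frpm_dir and out_dir are made up for illustration.

```r
# Sketch only: toy CSV files in a temporary folder stand in for the real data.
# The columns "ENR_TOTAL" and "frpm_count" and the codes are invented.
tmp <- tempdir()
write.csv(data.frame(CDS_CODE = c("01100170109835", "01100170112607"),
                     ENR_TOTAL = c(250, 310)),
          file.path(tmp, "enrollment_2000-2001.csv"), row.names = FALSE)
write.csv(data.frame(CDSCode = "01100170109835", frpm_count = 120, year = 2001),
          file.path(tmp, "frpm2001_clean1.csv"), row.names = FALSE)

# Folder arguments are hypothetical; they replace the hard-coded inputPath values.
enrol_frpm <- function(year, enrol_dir, frpm_dir, out_dir) {
  # read.csv already defaults to sep = "," and header = TRUE
  enrol <- read.csv(
    file.path(enrol_dir, paste0("enrollment_", year - 1, "-", year, ".csv")),
    colClasses = c("CDS_CODE" = "character"))
  # "NULL" in colClasses drops the year column while reading
  frpm <- read.csv(
    file.path(frpm_dir, paste0("frpm", year, "_clean1.csv")),
    colClasses = c("CDSCode" = "character", "year" = "NULL"))
  # rename only the key column and leave the other names alone
  names(frpm)[names(frpm) == "CDSCode"] <- "CDS_CODE"
  # by takes a quoted string; backticks would look up an object named CDS_CODE
  merged <- merge(enrol, frpm, by = "CDS_CODE", all.x = TRUE)
  write.csv(merged, file.path(out_dir, paste0("enrol_frpm", year, ".csv")),
            row.names = FALSE)
  invisible(merged)
}

# Run it once on the toy files; a loop over 2001:2018 would call it the same way.
merged01 <- enrol_frpm(2001, tmp, tmp, tmp)
```

With the real data, the three folder arguments would be the inputPath1, inputPath2 and inputPath3 strings from the post, and the for loop over years could stay as it is.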