Apply a function to specific columns in a list of data frames, simultaneously

SeaSA · May 6, 2022, 12:29pm

I am working on cleaning occurrence data and I want to check for duplicate occurrence records within raster grid cells. I have multiple species files (>100), so I read each file separately into a list, and I now have a single list containing 100 data frames.

The next step is to take a raster layer and assign each pixel a unique value (ID). The goal is to later extract these IDs for the occurrence data.

If I read in a single species file and run my code, it works perfectly. I want to be able to run it on the list of 100 data frames at once, but I'm not sure of the correct way to do this.

The code I used for running a single species file is:

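library(raster)

# Read the occurrence records for one species
DF= read.csv(".../NewFolder/Themeda_triandra.csv", sep = ",", header = TRUE)

# Give every raster cell a unique ID
rastID = raster(".../Rasters/30s_bio/wc2.1_30s_bio_15.tif")
rastID[] = 1:ncell(rastID) 
# Look up the cell ID under each record, then keep the first record per cell
occRastID= raster::extract(rastID, cbind(DF$decimalLongitude, DF$decimalLatitude))
summary(duplicated(occRastID))
cleanOccNoDup = DF[!duplicated(occRastID),]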

In attempting to do this for the entire list of data frames, I tried:

spDat = list()
sppFiles = list.files(".../NewFolder/")
sppNames = unlist(strsplit(sppFiles, split='.csv'))
summaryDatL = list(sppNames)

 for(s in 1:length(sppNames)){
   spDat[[s]] = read.csv(paste0(".../NewFolder/",sppNames[s],".csv"), sep=",",header=TRUE)
 }

rastID = raster(".../Rasters/30s_bio/wc2.1_30s_bio_15.tif")
rastID[] = 1:ncell(rastID) 
occRastID= raster::extract(rastID, cbind(spDat, "[", c("decimalLongitude","decimalLatitude")))
summary(duplicated(occRastID))
cleanOccNoDup = DF[!duplicated(occRastID),]

In the second attempt, I know I am doing something wrong with the brackets, but I'm not sure how to solve this. Please help!

Also, once the occurrence records have been cleaned, I want to keep the cleaned data frames in a list too.

Sanjmeh · May 6, 2022, 12:36pm

Print the file names inside the for loop to check that the paths are being constructed correctly. We cannot see your directory structure, so we can only guess. Set the rest of the code aside and focus on getting the file paths right first.
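
One way to run that check is sketched below. It is only an illustration: the temporary folder, the two toy CSV files and the second species name are placeholders standing in for the real ".../NewFolder/" contents. The idea is to let list.files() return full paths, print them, and confirm they exist before reading anything.

```r
# Placeholder folder with two toy CSV files (stand-in for the real data folder)
occDir <- file.path(tempdir(), "NewFolder")
dir.create(occDir, showWarnings = FALSE)
toy <- data.frame(decimalLongitude = c(1, 2), decimalLatitude = c(3, 4))
write.csv(toy, file.path(occDir, "Themeda_triandra.csv"), row.names = FALSE)
write.csv(toy, file.path(occDir, "Zea_mays.csv"), row.names = FALSE)

# Ask for full paths and only .csv files, so no path has to be pasted together by hand
sppFiles <- list.files(occDir, pattern = "\\.csv$", full.names = TRUE)
print(sppFiles)

# Stop early if any constructed path does not point at a real file
stopifnot(length(sppFiles) > 0, all(file.exists(sppFiles)))

# Species names are the file names without folder or extension
sppNames <- tools::file_path_sans_ext(basename(sppFiles))

# Read every file into a named list of data frames
spDat <- lapply(sppFiles, read.csv)
names(spDat) <- sppNames
str(spDat, max.level = 1)
```

Naming the list elements by species makes it easier to see which file a later error comes from.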

Not sure why you use triple dots (...). Which OS is this?
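
Once the files load correctly, the per-species step from the question can be applied to each data frame in the list with lapply(). The following is a minimal sketch, not a tested solution for the real data: the small in-memory raster and the two toy data frames are placeholders for the WorldClim layer and the species files, dropCellDuplicates is a hypothetical helper name, and it assumes every data frame has decimalLongitude and decimalLatitude columns.

```r
library(raster)

# Toy stand-in for the WorldClim layer: a 10 x 10 grid with a unique ID per cell
rastID <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
rastID[] <- 1:ncell(rastID)

# Toy stand-in for the list of species data frames
spDat <- list(
  sp_a = data.frame(decimalLongitude = c(0.2, 0.7, 5.5),
                    decimalLatitude  = c(9.5, 9.9, 5.5)),
  sp_b = data.frame(decimalLongitude = c(1.5, 1.6, 1.7, 8.5),
                    decimalLatitude  = c(2.5, 2.4, 2.6, 8.5))
)

# Hypothetical helper: keep the first record in each raster cell for one data frame.
# Note that records falling outside the raster get NA as cell ID, and
# duplicated() treats repeated NAs as duplicates of each other.
dropCellDuplicates <- function(df, rast) {
  cellID <- raster::extract(rast, cbind(df$decimalLongitude, df$decimalLatitude))
  df[!duplicated(cellID), , drop = FALSE]
}

# Apply the helper to every data frame; the result is again a named list
cleanOccNoDup <- lapply(spDat, dropCellDuplicates, rast = rastID)

# Number of records kept per species
sapply(cleanOccNoDup, nrow)
```

Because lapply() returns a list with the same names as its input, the cleaned records stay in a list of data frames, which is the structure the question asks to keep.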
