My function below works the way I want. I also want it to save the contents of filtered_xfile either to new objects, or to a list of objects (they are all different lengths so I'm not sure if that works).
I can't figure out how to save them under different names, for example filtered_xfile1, filtered_xfile2, filtered_xfile3. It prints them out just fine, but it obviously overwrites when I save to an object, so I just end up with the last xfile in the loop.
items <- c('HG13_2', 'GLAI18_2', 'HG15_2')
zscore_misfit <- function(data, xfile, item, sid) {
  xfile <- xfile %>%
    filter((xfile$ZSCORE > z_high | xfile$ZSCORE < z_low) &
             xfile$`ITEM LABEL` == item)
  filtered_xfile <<- xfile
  data <<- data %>%
    mutate("{item}" := replace(.data[[item]], .data[[sid]] %in% xfile$`PERSON LABEL`, 7774))
}
multiple_zscore <- function(xxx) {
  for (i in 1:(length(xxx))) {
    data <- zscore_misfit(data, xfile, xxx[i], "sidtp")
    print(items[i])
    print(filtered_xfile)
  }
}
test <- multiple_zscore(items)
You're using the superassignment operator (<<-) to store the object 'xfile' in the global environment under the name 'filtered_xfile'. This is repeated in every iteration of the for loop in multiple_zscore, each time overwriting the previous result. That's one of the hazards you need to watch out for when working with <<-. Since your call to zscore_misfit happens within the function multiple_zscore, you could just do something like
multiple_zscore <- function(xxx){
  # Allocate an empty list to store results
  result_list <- list()
  for (i in seq_len(length(xxx))){
    # Note: data and xfile are not arguments of multiple_zscore, so they are
    # looked up in the global environment at runtime. I'd recommend adding
    # them as arguments to the function definition of multiple_zscore,
    # e.g. multiple_zscore <- function(xxx, data, xfile, sid = "sidtp"){...}
    data <- zscore_misfit(data, xfile, xxx[i], "sidtp")
    # zscore_misfit has just written the filtered rows to filtered_xfile
    # (via <<-); store them in the ith entry before the next iteration
    # overwrites them.
    result_list[[i]] <- filtered_xfile
  }
  # Return list of results.
  return(result_list)
}
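To show that a list copes fine with elements of different sizes, here is a minimal sketch that avoids <<- entirely. It uses base R only so it runs without dplyr. The toy xfile, the cutoffs, and the helper name filter_misfit are my own assumptions for illustration, not part of your code.

```r
# Toy stand-ins for the real objects (column names copied from the question).
xfile <- data.frame(
  `PERSON LABEL` = c("p1", "p2", "p3", "p4"),
  `ITEM LABEL`   = c("HG13_2", "HG13_2", "GLAI18_2", "HG15_2"),
  ZSCORE         = c(3.1, -2.4, -2.9, 0.5),
  check.names    = FALSE
)
z_high <- 2
z_low  <- -2
items  <- c("HG13_2", "GLAI18_2", "HG15_2")

# Hypothetical helper: return the misfitting rows for one item, no side effects.
filter_misfit <- function(xfile, item, z_low, z_high) {
  keep <- (xfile$ZSCORE > z_high | xfile$ZSCORE < z_low) &
    xfile$`ITEM LABEL` == item
  xfile[keep, , drop = FALSE]
}

# Collect one data frame per item, named after the item.
filtered_list <- list()
for (item in items) {
  filtered_list[[item]] <- filter_misfit(xfile, item, z_low, z_high)
}

# Each element can have a different number of rows.
sapply(filtered_list, nrow)
```

You can then pull out a single result with filtered_list[["HG13_2"]] or filtered_list[[1]].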
If you really do need to have each of these objects in the global environment (following your naming convention filtered_xfile1, filtered_xfile2, ...), you could do
multiple_zscore <- function(xxx){
  for (i in seq_len(length(xxx))){
    data <- zscore_misfit(data, xfile, xxx[i], "sidtp")
    new_name <- paste0("filtered_xfile", i)
    # Store the current filtered_xfile under the name 'new_name' in the global
    # environment. Without envir, assign() would only create the object inside
    # this function.
    assign(new_name, filtered_xfile, envir = .GlobalEnv)
  }
}
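One thing to keep in mind is that assign() called inside a function writes to that function's own environment unless you pass envir. If you already have the results in a list, an alternative sketch is to name the list elements and copy them out in one go with list2env(). The toy values below are placeholders for your filtered data frames.

```r
# Toy results of different lengths (stand-ins for the filtered data frames).
results <- list(1:3, letters[1:2], 10)

# Name the elements filtered_xfile1, filtered_xfile2, ...
names(results) <- paste0("filtered_xfile", seq_along(results))

# Copy every list element into the global environment under its name.
invisible(list2env(results, envir = .GlobalEnv))

# The objects now exist as ordinary variables.
filtered_xfile2
```

That said, keeping everything in the one list is usually easier to work with than a set of numbered objects.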
As a complete aside: rather than seq_len(length(x)), you can simply use seq_along(x).
If you are going to fill a list from a for() loop, the best way to do that is to pre-allocate the result list and iterate over it with seq_along(), e.g.
res <- vector(mode = "list", length(xxx))
for (i in seq_along(res)) {
  # do stuff...
  res[[i]] <- ....
}
Better, though, would be to forgo the explicit loop and simply use lapply().
res <- lapply(xxx, zscore_misfit, data = data, xfile = xfile, sid = "sidtp")
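A small self-contained illustration of that lapply() pattern: extra named arguments are passed through to the function on every call, and if the input vector is named, the result list carries the same names. The data frame zs and the helper flag_item are made up for this sketch.

```r
# Toy z-scores for three items (hypothetical data).
zs <- data.frame(
  item = c("A", "A", "B", "C"),
  z    = c(3.1, 0.2, -2.9, 0.5)
)
items <- c("A", "B", "C")

# Hypothetical helper: z-scores of one item that exceed the cutoff in absolute value.
flag_item <- function(item, zs, cutoff) {
  zs$z[zs$item == item & abs(zs$z) > cutoff]
}

# setNames() names the input, so the output list is named by item.
res <- lapply(setNames(items, items), flag_item, zs = zs, cutoff = 2)

# Elements can have different lengths, including zero.
lengths(res)
```

Each element of items is matched to the first argument that is not supplied by name, which is item here and also in the zscore_misfit call above.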
All of this is ignoring the fact that it would probably just be better and easier to rewrite the original function to be properly vectorized itself.
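For what it's worth, here is one rough way such a vectorized version might look for the filtering part, under my own assumptions: flag all misfitting rows for every item in a single step, then split() the result by item. It is base R only, the toy data is invented, and it does not cover the replacement of values in data.

```r
# Toy stand-ins for the real objects (column names copied from the question).
xfile <- data.frame(
  `PERSON LABEL` = c("p1", "p2", "p3", "p4", "p5"),
  `ITEM LABEL`   = c("HG13_2", "HG13_2", "GLAI18_2", "HG15_2", "HG13_2"),
  ZSCORE         = c(3.1, 0.2, -2.9, 0.5, -2.5),
  check.names    = FALSE
)
z_high <- 2
z_low  <- -2
items  <- c("HG13_2", "GLAI18_2", "HG15_2")

# Flag misfitting rows for all requested items at once.
is_misfit <- (xfile$ZSCORE > z_high | xfile$ZSCORE < z_low) &
  xfile$`ITEM LABEL` %in% items
flagged <- xfile[is_misfit, , drop = FALSE]

# One data frame per item; using a factor with explicit levels keeps items
# that have no flagged rows as zero-row data frames, in the order of 'items'.
filtered_by_item <- split(flagged, factor(flagged$`ITEM LABEL`, levels = items))

sapply(filtered_by_item, nrow)
```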