Create new objects from a for loop

My function below works the way I want. I also want it to save the contents of filtered_xfile either to new objects, or to a list of objects (they are all different lengths so I'm not sure if that works).

I can't figure out how to save them with different names, for example filtered_xfile1, filtered_xfile2, filtered_xfile3. It prints them out just fine, but saving to an object obviously overwrites it on every pass, so I just end up with the last xfile in the loop.

items <- c('HG13_2', 'GLAI18_2', 'HG15_2')

zscore_misfit <- function(data, xfile, item, sid) {
  xfile <- xfile %>%
    filter((xfile$ZSCORE > z_high | xfile$ZSCORE < z_low) & 
             xfile$`ITEM LABEL` == item)
  
  filtered_xfile <<- xfile
  
  data <<- data %>% 
    mutate("{item}" := replace(.data[[item]], .data[[sid]] %in% xfile$`PERSON LABEL`, 7774))
}

multiple_zscore <- function(xxx) {
  for (i in 1:(length(xxx))) {
    data <- zscore_misfit(data, xfile, xxx[i], "sidtp")
    print(items[i])
    print(filtered_xfile)
  }
}

test <- multiple_zscore(items)

You're using the superassignment operator (<<-) to store the object 'xfile' in the global environment under the name 'filtered_xfile'. This is repeated in every iteration of the for loop in multiple_zscore, each time overwriting the previous result. That's one of the hazards you'll need to watch out for when working with <<-. Since your call to zscore_misfit happens within the function multiple_zscore, you could just do something like

multiple_zscore <- function(xxx){
  # Allocate an empty list to store results
  result_list <- list()

  for (i in seq_len(length(xxx))){
    data <- zscore_misfit(data, xfile, xxx[i], "sidtp")

    # Note: data and xfile are not provided as arguments in the function call.
    # Perhaps because they can be assumed to exist in the global environment
    # during runtime. However, I'd recommend adding these as arguments to the 
    # function definition of multiple_zscore.
    # e.g. multiple_zscore <- function(xxx, data, xfile, sid = "sidtp"){...}

    # Store the filtered rows that zscore_misfit just wrote to filtered_xfile
    # (via <<-) in the ith entry of result_list.
    result_list[[i]] <- filtered_xfile
  }
  # Return list of results.
  return(result_list)
}
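
A variation on the above, sketched under the assumption that you are free to change zscore_misfit itself: have it return both the updated data and the filtered rows instead of writing them out with <<-. The name zscore_misfit2, the z_low/z_high arguments and their default values, and the toy data frames are all made up for illustration, and base R is used so the sketch runs without dplyr.

```r
# Hypothetical rewrite: returns its results instead of using <<-.
zscore_misfit2 <- function(data, xfile, item, sid, z_low = -2, z_high = 2) {
  keep <- (xfile$ZSCORE > z_high | xfile$ZSCORE < z_low) &
    xfile$`ITEM LABEL` == item
  filtered <- xfile[keep, , drop = FALSE]

  # Recode the flagged persons for this item, as in the original function.
  flagged <- data[[sid]] %in% filtered$`PERSON LABEL`
  data[[item]][flagged] <- 7774

  list(data = data, filtered = filtered)
}

# Toy inputs (made up for this example).
xfile <- data.frame(
  `PERSON LABEL` = c("a", "b", "c", "a"),
  `ITEM LABEL`   = c("HG13_2", "HG13_2", "GLAI18_2", "GLAI18_2"),
  ZSCORE         = c(3.1, 0.2, -2.5, 0.4),
  check.names    = FALSE
)
data <- data.frame(
  sidtp    = c("a", "b", "c"),
  HG13_2   = c(1, 2, 3),
  GLAI18_2 = c(4, 5, 6)
)

items <- c("HG13_2", "GLAI18_2")
filtered_list <- vector("list", length(items))
names(filtered_list) <- items

for (item in items) {
  out <- zscore_misfit2(data, xfile, item, "sidtp")
  data <- out$data                       # carry the recoded data forward
  filtered_list[[item]] <- out$filtered  # keep this item's filtered rows
}
```

Each element of filtered_list can have a different number of rows, and filtered_list[["HG13_2"]] retrieves the rows for one item by name.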

If you really do need to have each of these objects in the global environment (following your naming convention filtered_xfile1, filtered_xfile2, ...), you could do

multiple_zscore <- function(xxx){
  for (i in seq_len(length(xxx))){
    data <- zscore_misfit(data, xfile, xxx[i], "sidtp")
    new_name <- paste0("filtered_xfile",i)
    
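    # Store the filtered rows under the name 'new_name' in the global environment.
    # Without envir = .GlobalEnv, assign() would create the object inside this function.
    assign(new_name, filtered_xfile, envir = .GlobalEnv)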
  }
}
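
One detail worth checking when calling assign() inside a function: by default it creates the object in the environment it is called from, so the target environment generally has to be named explicitly for the object to land in the global environment. A small sketch, with made-up function and object names:

```r
make_local <- function() {
  # No envir argument: the object is created inside this function call
  # and disappears when the function returns.
  assign("tmp_local", 1)
  invisible(NULL)
}

make_global <- function() {
  # Explicit envir: the object is created in the global environment.
  assign("tmp_global", 2, envir = .GlobalEnv)
  invisible(NULL)
}

make_local()
make_global()

exists("tmp_local", envir = .GlobalEnv, inherits = FALSE)
exists("tmp_global", envir = .GlobalEnv, inherits = FALSE)
```

Even when this works, a named list is usually easier to loop over afterwards than a set of numbered global objects.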

As a complete aside, rather than seq_len(length(x)) you can simply use seq_along(x).
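
To illustrate why this matters compared with the 1:length(x) form used in the original question, here is what each expression produces for an empty input (x is just a made-up example vector):

```r
x <- character(0)

a <- 1:length(x)         # counts down from 1 to 0, so a loop over it runs twice
b <- seq_len(length(x))  # empty, so a loop over it is skipped
c <- seq_along(x)        # same as b, with less typing

print(a)
print(b)
print(c)
```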

If you are going to enter elements into a list from a for() loop, the best way to do that is to pre-allocate your result list and seq_along() that, e.g.

res <- vector(mode = "list", length(xxx))
for (i in seq_along(res)) {
    # do stuff...
    res[[i]] <- ....
}
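
A runnable version of that pattern, as a minimal sketch: the work inside the loop is a placeholder (seq_len(i)) chosen only to show that list elements may have different lengths, which was one of the concerns in the question.

```r
xxx <- c("HG13_2", "GLAI18_2", "HG15_2")

# Pre-allocate one slot per item and name the slots after the items.
res <- vector(mode = "list", length = length(xxx))
names(res) <- xxx

for (i in seq_along(res)) {
  # Placeholder work: every result has a different length.
  res[[i]] <- seq_len(i)
}

lengths(res)
```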

Better, though, would be to forgo the explicit loop and simply use lapply().

res <- lapply(xxx, zscore_misfit, data = data, xfile = xfile, sid = "sidtp")
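
In a call like this, each element of xxx is passed to the first argument that is not already matched by name, and the named arguments are passed through unchanged on every call. A small sketch of the same mechanism, where describe_item is a hypothetical stand-in for the real per-item function; naming the input with setNames() also gives a named result list:

```r
items <- c("HG13_2", "GLAI18_2", "HG15_2")

# Hypothetical stand-in for the real per-item function.
describe_item <- function(item, suffix) {
  paste0(item, suffix)
}

# Each element of the input goes to 'item'; 'suffix' is fixed for every call.
res <- lapply(setNames(items, items), describe_item, suffix = "_done")

res[["GLAI18_2"]]
```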

All of this is ignoring the fact that it would probably just be better and easier to rewrite the original function to be properly vectorized itself.
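
One possible reading of "vectorized" here, as a sketch only: filter the rows for all items in a single step and then split the result by item. The toy xfile and the cutoff values are made up, and base R is used so the sketch runs without dplyr.

```r
# Toy data (made up for this example).
xfile <- data.frame(
  `PERSON LABEL` = c("a", "b", "c", "d"),
  `ITEM LABEL`   = c("HG13_2", "HG13_2", "GLAI18_2", "HG15_2"),
  ZSCORE         = c(3.1, 0.2, -2.5, 4.0),
  check.names    = FALSE
)
items  <- c("HG13_2", "GLAI18_2")
z_low  <- -2  # assumed cutoffs
z_high <- 2

# One filter over all items at once.
keep <- (xfile$ZSCORE > z_high | xfile$ZSCORE < z_low) &
  xfile$`ITEM LABEL` %in% items
misfits <- xfile[keep, , drop = FALSE]

# One data frame per item, named by item label.
filtered_by_item <- split(misfits, misfits$`ITEM LABEL`)
```

Note that an item with no misfitting rows will not appear in filtered_by_item at all, which may or may not be what you want.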
