Using append to add a column in between two columns to some (but NOT all) dataframes in a list

I have created a list that pulls in all of the .csv files in a specific folder on my computer. For every file in this list that has 9 columns, I want to add a new column, so that all files have a total of 10 columns. How would I go about doing this to a list? I want to add the 10th column in the 7th column position, with the header "additional column". The purpose is to give all my files the same number of columns so that the next sequence of code runs correctly. I have attached an image of my list.

You can define a function that edits a data.frame and then use map to apply that function to your list. (In this code chunk, I also use map to read in the CSVs.)

library(purrr)

list.filenames <- list.files(".", pattern = "\\.csv$") # escape the dot so it only matches a literal "."

list.data <- map(list.filenames, read.csv)

Define the function and use it in map (you can also write it into map directly, but I prefer to define functions that contain "if" outside of map).

edit_df <- function(data) {
  # only change data frames that are missing the extra column
  if (ncol(data) == 9) {
    data$another_column <- NA # appended as column 10

    data <- data[, c(1:6, 10, 7:9)] # order columns so the new column is in the 7th place
  }

  return(data)
}

list.data.edit <- map(list.data, edit_df)
str(list.data.edit)

List of 5
 $ :'data.frame':	5 obs. of  10 variables:
  ..$ a  : int [1:5] 1 2 3 4 5
  ..$ b  : int [1:5] 1 2 3 4 5
  ..$ c  : int [1:5] 1 2 3 4 5
  ..$ d  : int [1:5] 1 2 3 4 5
  ..$ e  : int [1:5] 1 2 3 4 5
  ..$ f  : int [1:5] 1 2 3 4 5
  ..$ g  : int [1:5] 1 2 3 4 5
  ..$ f.1: int [1:5] 1 2 3 4 5
  ..$ h  : int [1:5] 1 2 3 4 5
  ..$ i  : int [1:5] 1 2 3 4 5
 $ :'data.frame':	5 obs. of  10 variables:
  ..$ a             : int [1:5] 1 2 3 4 5
  ..$ b             : int [1:5] 1 2 3 4 5
  ..$ c             : int [1:5] 1 2 3 4 5
  ..$ d             : int [1:5] 1 2 3 4 5
  ..$ e             : int [1:5] 1 2 3 4 5
  ..$ f             : int [1:5] 1 2 3 4 5
  ..$ another_column: logi [1:5] NA NA NA NA NA
  ..$ g             : int [1:5] 1 2 3 4 5
  ..$ h             : int [1:5] 1 2 3 4 5
  ..$ i             : int [1:5] 1 2 3 4 5
 $ :'data.frame':	5 obs. of  10 variables:
  ..$ a  : int [1:5] 1 2 3 4 5
  ..$ b  : int [1:5] 1 2 3 4 5
  ..$ c  : int [1:5] 1 2 3 4 5
  ..$ d  : int [1:5] 1 2 3 4 5
  ..$ e  : int [1:5] 1 2 3 4 5
  ..$ f  : int [1:5] 1 2 3 4 5
  ..$ g  : int [1:5] 1 2 3 4 5
  ..$ f.1: int [1:5] 1 2 3 4 5
  ..$ h  : int [1:5] 1 2 3 4 5
  ..$ i  : int [1:5] 1 2 3 4 5
 $ :'data.frame':	5 obs. of  10 variables:
  ..$ a             : int [1:5] 1 2 3 4 5
  ..$ b             : int [1:5] 1 2 3 4 5
  ..$ c             : int [1:5] 1 2 3 4 5
  ..$ d             : int [1:5] 1 2 3 4 5
  ..$ e             : int [1:5] 1 2 3 4 5
  ..$ f             : int [1:5] 1 2 3 4 5
  ..$ another_column: logi [1:5] NA NA NA NA NA
  ..$ g             : int [1:5] 1 2 3 4 5
  ..$ h             : int [1:5] 1 2 3 4 5
  ..$ i             : int [1:5] 1 2 3 4 5
 $ :'data.frame':	5 obs. of  10 variables:
  ..$ a  : int [1:5] 1 2 3 4 5
  ..$ b  : int [1:5] 1 2 3 4 5
  ..$ c  : int [1:5] 1 2 3 4 5
  ..$ d  : int [1:5] 1 2 3 4 5
  ..$ e  : int [1:5] 1 2 3 4 5
  ..$ f  : int [1:5] 1 2 3 4 5
  ..$ g  : int [1:5] 1 2 3 4 5
  ..$ f.1: int [1:5] 1 2 3 4 5
  ..$ h  : int [1:5] 1 2 3 4 5
  ..$ i  : int [1:5] 1 2 3 4 5
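
If you would rather not hard-code the indices, here is a minimal sketch of a more general helper built on base R's append(). The names insert_column and pad_to_ten and the demo data frames are made up for illustration; they are not part of the code above, and I am assuming the new column should simply be filled with NA.

```r
# Hypothetical helper: add a column called `name` at `position`, filled with `value`.
insert_column <- function(data, name, position, value = NA) {
  n <- ncol(data)
  stopifnot(position >= 1, position <= n + 1, !(name %in% names(data)))

  # Add the new column at the end (it becomes column n + 1) ...
  data[[name]] <- rep(value, nrow(data))

  # ... then build the column order, e.g. 1:6, 10, 7:9 for n = 9 and position = 7.
  idx <- append(seq_len(n), n + 1, after = position - 1)
  data[, idx, drop = FALSE]
}

# Hypothetical wrapper: only pad data frames that have exactly 9 columns.
pad_to_ten <- function(data) {
  if (ncol(data) == 9) {
    insert_column(data, "additional column", 7)
  } else {
    data
  }
}

# Made-up demo data standing in for the CSVs: one 9-column and one 10-column data frame.
df9  <- as.data.frame(matrix(1:18, nrow = 2, dimnames = list(NULL, letters[1:9])))
df10 <- as.data.frame(matrix(1:20, nrow = 2, dimnames = list(NULL, letters[1:10])))

demo.list <- lapply(list(df9, df10), pad_to_ten)

# Check that every element now has 10 columns.
col.counts <- vapply(demo.list, ncol, integer(1))
```

With the real data, the last step would be map(list.data, pad_to_ten) instead of the lapply call on the demo list. Checking the column counts afterwards is a cheap way to confirm that every file is ready for the next sequence of code.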

This should work for your specific use case, but it hard-codes the column count and positions, so it probably isn't well adapted to other uses. If the end result should be one data frame containing the data from all the CSVs, you can use map_dfr to read each file and bind the datasets into one.

all_csvs <- map_dfr(list.filenames, read.csv)
str(all_csvs)

'data.frame':	25 obs. of  10 variables:
 $ a  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ b  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ c  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ d  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ e  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ f  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ g  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ f.1: int  1 2 3 4 5 NA NA NA NA NA ...
 $ h  : int  1 2 3 4 5 1 2 3 4 5 ...
 $ i  : int  1 2 3 4 5 1 2 3 4 5 ...
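
One thing to keep in mind with this route: map_dfr binds with dplyr::bind_rows(), which matches columns by name. A column that is missing from some files is filled with NA, and it ends up wherever it first appears rather than necessarily in the 7th position. A small sketch with made-up data frames (df_short and df_long are hypothetical, and dplyr needs to be installed):

```r
library(dplyr)

# Made-up data: df_short lacks the "extra" column that df_long has.
df_short <- data.frame(a = 1:2, b = 3:4)
df_long  <- data.frame(a = 5:6, extra = 7:8, b = 9:10)

# Columns are matched by name; rows from df_short get NA in "extra".
combined <- bind_rows(df_short, df_long)

# The columns of the first data frame come first, then any new ones.
col.order <- names(combined)

# Reorder by name afterwards if the position matters.
combined <- combined[, c("a", "extra", "b")]
```

So if the column position matters for the next step, reorder after binding, or pad each data frame first and bind the edited list.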