Select groups of columns from an Excel/CSV file in R

I have a CSV file with 100 columns. I would like to select the columns in groups of 4 and write each group to a new file. In other words, columns 1-4 go into one new file, then 5-8, 9-12, and so on: from 1 to 100 with a step size of 4. How can I do it?

Hi,

You could do it like this.
A more experienced R user could probably write a for loop to do it faster.

library(dplyr)

# example data: 10 rows, 20 columns named X1 to X20
df <- data.frame(replicate(20, sample(1:10, 10, replace = TRUE)))

df_1 <- df %>%
  select(X1, X2, X3, X4)

df_2 <- df %>%
  select(X5, X6, X7, X8)

df_3 <- df %>%
  select(X9, X10, X11, X12)

# continue in the same way for the remaining column names
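
The same grouping can be written without typing the column names by hand. Below is a minimal base R sketch, assuming groups of 4 consecutive columns; the names `group_size`, `group_id`, and `df_groups` are illustrative and do not come from this thread.

```r
# Example data frame with 20 columns, as above
df <- data.frame(replicate(20, sample(1:10, 10, replace = TRUE)))

group_size <- 4  # assumed group width

# Group index for each column: 1 1 1 1 2 2 2 2 ...
group_id <- ceiling(seq_along(df) / group_size)

# split.default() splits a data frame by columns
# (plain split() on a data frame would split it by rows)
df_groups <- split.default(df, group_id)

# df_groups[[1]] holds X1 to X4, df_groups[[2]] holds X5 to X8, and so on
names(df_groups[[2]])
```

Because the group index is computed from `seq_along(df)`, the same lines should work for a data frame with any number of columns; the last group is simply shorter when the column count is not a multiple of 4.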

Thank you. The only problem is that I need this to work for data frames of different dimensions (not just 10, for example), using the same process.

The dimensions of the data frame don't matter; it should work.

Thank you. It worked with:

# import your original csv file
my_file <- read.csv("filename.csv")

# create an auxiliary list of column groups
colnumbers <- 1:100
colsplits <- split(colnumbers, ceiling(colnumbers/4))

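# walk through it, writing one file per group
# (row.names = FALSE avoids writing the row index as an extra column)
purrr::iwalk(
  colsplits,
  ~ write.csv(my_file[, .x], paste0(.y, ".csv"), row.names = FALSE)
)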
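
To see why this produces groups of four, it can help to run the grouping step on a small range. The sketch below uses 10 columns instead of 100 (an assumption made only for illustration) and also shows one way to derive the range from the data with `ncol()`, so the number of columns need not be hard-coded. `small_df` is a made-up example object.

```r
colnumbers <- 1:10

# Each column number is mapped to its group: 1-4 -> 1, 5-8 -> 2, 9-10 -> 3
ceiling(colnumbers / 4)
# [1] 1 1 1 1 2 2 2 2 3 3

# split() collects the column numbers that share a group;
# the last group holds only columns 9 and 10, since 10 is not a multiple of 4
colsplits <- split(colnumbers, ceiling(colnumbers / 4))

# For a data frame of any width, derive the column numbers from the data
small_df <- as.data.frame(matrix(0, nrow = 2, ncol = 6))  # hypothetical data
colnumbers <- seq_len(ncol(small_df))
colsplits <- split(colnumbers, ceiling(colnumbers / 4))
lengths(colsplits)
```

The names of `colsplits` ("1", "2", ...) are what `iwalk` passes as `.y`, which is why the output files end up named 1.csv, 2.csv, and so on.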

After that, I created a list of the file names (list_of_files) with list.files(pattern="*.csv") and read them with
file_list <- lapply(list_of_files, read_csv, col_names = TRUE)
Do you know how I can apply the same function to all the files in the list? I tried with function(df) but it doesn't work. I need to rename all the columns and concatenate two of them in every file.

Once you have read the smaller files with

list_of_files <- list.files(pattern = "\\.csv$")
file_list <- lapply(list_of_files, read_csv, col_names = TRUE)

you can use lapply on file_list to repeat the same operation on each 4-column data.frame, e.g.

column_sums <- lapply(file_list, colSums)
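
For the renaming and concatenation asked about above, one option is to wrap the steps in a function and pass it to lapply. This is only a sketch: the new column names (`a` to `d`), the choice of which two columns to concatenate, and the helper name `tidy_columns` are all assumptions, and a small in-memory list stands in for `file_list`.

```r
# Stand-in for file_list: two 4-column data frames (hypothetical data)
file_list <- list(
  data.frame(X1 = 1:2, X2 = 3:4, X3 = 5:6, X4 = 7:8),
  data.frame(X5 = 1:2, X6 = 3:4, X7 = 5:6, X8 = 7:8)
)

# Hypothetical helper: rename the columns, then paste two of them together
tidy_columns <- function(df) {
  names(df) <- c("a", "b", "c", "d")     # assumed new names
  df$ab <- paste(df$a, df$b, sep = "_")  # concatenate columns a and b
  df                                     # return the modified data frame
}

# Apply the same function to every data frame in the list
tidy_list <- lapply(file_list, tidy_columns)
tidy_list[[1]]$ab
```

The function has to return the modified data frame as its last expression; otherwise lapply collects whatever the last assignment evaluated to instead of the data frame.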
