I have a CSV file with 100 columns. I would like to select the columns in groups of 4 and write each group to a new file. To be clear: columns 1-4 go in one new file, then 5-8, then 9-12, etc. Basically from 1 to 100 with a step size of 4. How can I do it?
Hi,
You could do it like this.
A more advanced R user could probably write a for loop to do it faster.
library(dplyr)
# example data: 10 rows, 20 columns named X1 to X20
df <- data.frame(replicate(20, sample(1:10, 10, replace = TRUE)))
df_1 <- df %>%
  select(X1, X2, X3, X4)
df_2 <- df %>%
  select(X5, X6, X7, X8)
df_3 <- df %>%
  select(X9, X10, X11, X12)
# continue the same pattern for the remaining column names
Thank you. The only problem is that I need it to work for data frames of different dimensions, not just a fixed size like this example, using the same process.
The dimensions of the data frame don't matter; it should work.
Thank you. It worked with:
# import your original csv file
my_file <- read.csv("filename.csv")
# create an auxiliary list of column groups
colnumbers <- 1:100
colsplits <- split(colnumbers, ceiling(colnumbers/4))
# walk through it
purrr::iwalk(
  colsplits,
  ~ write.csv(my_file[, .x], paste0(.y, ".csv"))
)
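The same approach can be written without hard-coding 100, so it works for a data frame with any number of columns. The following is a minimal sketch, not the exact code used above: the helper name split_columns_to_csv, its arguments, and the use of a temporary directory are assumptions made for illustration. It uses base R only, passes row.names = FALSE so the row index is not written as an extra column, and uses drop = FALSE so a final group with a single column is still a data frame.

```r
# Hypothetical helper: write every `size` consecutive columns of `df`
# to its own CSV file in `out_dir` and return the file paths.
split_columns_to_csv <- function(df, size = 4, out_dir = tempdir()) {
  col_numbers <- seq_len(ncol(df))
  # Group column indices: 1-4 -> "1", 5-8 -> "2", ...
  col_splits <- split(col_numbers, ceiling(col_numbers / size))
  paths <- file.path(out_dir, paste0(names(col_splits), ".csv"))
  for (i in seq_along(col_splits)) {
    # drop = FALSE keeps a single-column group as a data frame
    write.csv(df[, col_splits[[i]], drop = FALSE], paths[i], row.names = FALSE)
  }
  paths
}

# Example data (made up): 10 rows, 10 columns, so the last file has 2 columns
df <- data.frame(replicate(10, sample(1:10, 10, replace = TRUE)))
paths <- split_columns_to_csv(df, size = 4)
```

With real data, df would be the result of read.csv("filename.csv") and out_dir the folder where the smaller files should go.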
After that, I created a list of the file names with list.files(pattern="*.csv") and read the files with
file_list<-lapply(list_of_files, read_csv, col_names = TRUE)
Do you know how I can apply the same function to all the files in the list? I tried with function(df) but it doesn't work. I need to rename all the columns and concatenate two of them in every file.
Once you have read the smaller files with
list_of_files <- list.files(pattern = "\\.csv$")
file_list <- lapply(list_of_files, read_csv, col_names = TRUE)
you can use lapply on file_list to repeat the same operation on each 4-column data.frame, e.g.
column_sums <- lapply(file_list, colSums)
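For the renaming and concatenation asked about above, one option is to write a small function that takes one data frame and returns the modified one, then pass it to lapply. This is only a sketch: the function name tidy_columns, the new column names a to d, and the choice to join the first two columns with an underscore are assumptions, since the thread does not say which names or columns are wanted. A small made-up list stands in for file_list so the example runs on its own.

```r
# Hypothetical per-file operation: rename the four columns, then
# concatenate two of them into a new column.
tidy_columns <- function(df) {
  names(df) <- c("a", "b", "c", "d")          # assumed new names
  df$a_b <- paste(df$a, df$b, sep = "_")      # assumed columns and separator
  df
}

# Stand-in for file_list: two small 4-column data frames (made-up data)
file_list <- list(
  data.frame(X1 = 1:2, X2 = 3:4, X3 = 5:6, X4 = 7:8),
  data.frame(X5 = 9:10, X6 = 11:12, X7 = 13:14, X8 = 15:16)
)

# Apply the same function to every data frame in the list
tidy_list <- lapply(file_list, tidy_columns)
```

The function must return the data frame as its last expression, otherwise lapply collects something else. It also assumes each data frame has exactly four columns; files written by write.csv without row.names = FALSE contain an extra leading column of row names, which would need to be dropped first.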