read csv files all at once

supermarco · April 27, 2024, 1:47pm

Hello,
I have a strange problem reading many CSV files in R. I cannot import them all at once because each file has this structure in a single column:

...
Source: Month 195911
Method used: Weight
Zone: Europe
Variables,"Total1","Total2","Total3"
=> Total,"354","8667","2118"
Area1,"109","85","67"
Area2,"89","68","52"
...

I want to keep only the title "Month 195911" as the column name, along with all the area rows (Area1, ...). My code is:

my_data <- import_list(dir("C:/Users/", pattern = ".csv"), rbind = TRUE)

My question is: how can I read all the files at once while keeping the title line (1), deleting the useless lines (2-5), and importing the remaining lines of every file? Many thanks.

DavoWW · May 3, 2024, 4:50am

Hi @supermarco
I'm not sure if you have solved this problem, but in case you haven't, here is some code you can try:

library(tidyverse)

# Get the list of CSV files to read
# Assumes all required files are in the current working directory with no others
# The pattern is a regular expression, so escape the dot and anchor the end
infiles <- list.files(pattern="\\.csv$")
infiles

# Make a user-defined function to read one file
my_reader <- function(infile){
  # Read and edit line 1 from the file
  fileID <- readLines(con=infile, n=1)
  fileID <- str_split_i(fileID, ": Month ", 2)
  fileID <- str_remove(fileID, ",,,")  # Strip the trailing commas present in my test CSV files
  print(fileID) # to show progress
  
  # Read the body of the data, fix column names, add fileID column
  body <- read.csv(infile, header=FALSE, skip=5)
  names(body) <- c("Area","Total1","Total2","Total3")
  body$fileID <- fileID
  return(body)
}

# Read all the files into a list
input.lst <- lapply(infiles, my_reader)

# bind all the list parts into a data.frame
input.df <- bind_rows(input.lst)
input.df
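If you want to try the idea without touching your real data, here is a minimal base R sketch of the same approach. It assumes that line 1 of every file is the "Source: Month ..." line and that the area rows start on line 6, as in the sample above. The folder demo_dir, the file names, the helper names make_demo and read_one, and the values for the second month are all made up for illustration. The last step is one possible reading of "keep the title as the column name": it puts the month into the column names and merges the files side by side on Area.

```r
# Build two small demo files shaped like the sample in the question.
# File names and the 195912 values are hypothetical.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)

make_demo <- function(month, area1, area2) {
  writeLines(c(
    paste("Source: Month", month),
    "Method used: Weight",
    "Zone: Europe",
    'Variables,"Total1","Total2","Total3"',
    '=> Total,"354","8667","2118"',
    paste0("Area1,", paste0('"', area1, '"', collapse = ",")),
    paste0("Area2,", paste0('"', area2, '"', collapse = ","))
  ), file.path(demo_dir, paste0("month_", month, ".csv")))
}
make_demo("195911", c(109, 85, 67), c(89, 68, 52))
make_demo("195912", c(110, 86, 68), c(90, 69, 53))

# Read one file: take the month from line 1, then skip the 5 header lines
read_one <- function(infile) {
  first <- readLines(infile, n = 1)
  # Drop the "Source: Month " prefix and any trailing commas
  month <- sub(",+$", "", sub("^Source: Month ", "", first))
  body <- read.csv(infile, header = FALSE, skip = 5,
                   col.names = c("Area", "Total1", "Total2", "Total3"))
  body$Month <- month
  body
}

# full.names = TRUE keeps the folder in each path, so the working
# directory does not matter
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Long format: one row per area per file, with the month as an extra column
long <- do.call(rbind, lapply(files, read_one))

# Wide format: the month becomes part of the column names, one row per area
wide <- Reduce(
  function(x, y) merge(x, y, by = "Area", all = TRUE),
  lapply(files, function(f) {
    d <- read_one(f)
    month <- d$Month[1]
    d$Month <- NULL
    names(d)[-1] <- paste(names(d)[-1], month, sep = "_")
    d
  })
)
```

The long table is usually easier to filter and plot. The wide one is closer to what was asked for, but it gets unwieldy once there are many months.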

HTH
