How to find content in files within subdirectories and store it in a data frame

I have a set of directories and subdirectories where I store files every day. I want to search all directories and subdirectories for files that contain FRE at the start of a line, copy the matching content, and store it in a data frame. I usually use findstr on the Windows command line. Can anybody tell me whether it is possible to do this in R?

For instance:
Directories and subdirectories:

Files in folder 01082019:
- AB_01082019_120101.txt
- AB_01082019_121053.txt
- BD_01082019_132505.txt
Files in folder 02082019:
- AB_02082019_120301.txt
- AB_02082019_121102.txt
- BD_02082019_132408.txt
Files in folder 03082019:
- AB_03082019_120215.txt
- BD_03082019_120906.txt
- BD_03082019_132614.txt

Files content (BD_*.txt files):

  • line 1: FRE1234567894521234874654153123
  • line 2: GRE2545678145648454186164845416
  • line 3: FRE5612345315415343153415389123

Intended data frame structure:
File path, File Date, FRE Data
Line 1: c:/MyFiles/2019/082019/01082019/BD_01082019_132505.txt, 01082019, FRE1234567894521234874654153123
Line 2: c:/MyFiles/2019/082019/03082019/BD_03082019_132614.txt, 03082019, FRE5612345315415343153415389123


You might want to take a look at the list.files function.

This will allow you to browse directories and use regex patterns to filter for the files you want. The matching file names (with full paths if you set full.names = TRUE) are returned as a character vector. In a next step you can pass each path to whatever reading function suits the file type (e.g. read.table or readLines) to load the data.
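As a minimal sketch of that first step: the folder below is a throwaway tree built in a temporary directory (the root name and the AB file content are placeholders, not the asker's real data), so the snippet runs on its own.

```r
# Build a small demo tree; "listfiles_demo" is a hypothetical root folder.
root <- file.path(tempdir(), "listfiles_demo")
dir.create(file.path(root, "01082019"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("FRE1234567894521234874654153123",
             "GRE2545678145648454186164845416"),
           file.path(root, "01082019", "BD_01082019_132505.txt"))
writeLines("GRE2545678145648454186164845416",   # placeholder content
           file.path(root, "01082019", "AB_01082019_120101.txt"))

# List every .txt file below root; the pattern is a regex matched
# against the file name, and full.names = TRUE keeps the path.
txt_files <- list.files(root, pattern = "\\.txt$",
                        recursive = TRUE, full.names = TRUE)

# Read one file as a character vector, one element per line.
first_file_lines <- readLines(txt_files[1])
```

readLines is used here because the files in the question are plain lines of text rather than delimited tables; read.table would be the choice for column-structured files.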

Using mapping functions (like map from the purrr package), you can then load all files into a list of data frames without having to do it one by one. If all files have the same structure, you can even merge all data frames into one big frame using map_df (also from purrr).
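A rough sketch of the mapping idea, assuming purrr (and dplyr, which map_df relies on) is installed. The helper read_as_df and the demo folder are made up for illustration. In recent purrr versions map_df is marked as superseded in favour of map followed by list_rbind, but it still works.

```r
library(purrr)  # map_df() additionally needs the dplyr package installed

# Demo tree in a temp dir; "purrr_demo" is a hypothetical root folder.
root <- file.path(tempdir(), "purrr_demo")
dir.create(file.path(root, "01082019"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "03082019"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("FRE1234567894521234874654153123",
             "GRE2545678145648454186164845416"),
           file.path(root, "01082019", "BD_01082019_132505.txt"))
writeLines("FRE5612345315415343153415389123",
           file.path(root, "03082019", "BD_03082019_132614.txt"))

files <- list.files(root, pattern = "\\.txt$",
                    recursive = TRUE, full.names = TRUE)

# Hypothetical helper: one file in, one data frame out (one row per line).
read_as_df <- function(path) {
  lines <- readLines(path)
  data.frame(file_path = rep(path, length(lines)),
             line = lines,
             stringsAsFactors = FALSE)
}

list_of_dfs <- map(files, read_as_df)     # a list of data frames
all_lines   <- map_df(files, read_as_df)  # the same data, row-bound into one
```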

Hope this helps,

Yes, it's definitely possible (this is R after all)!
To get all possible files use something like this (from the top directory):

my_list <- list.files(pattern = "\\.txt$", recursive = TRUE)

Then write a function that reads some lines (n = ??) to check whether "FRE" is present. Use that information to decide which files to read in full, saving the path and the file creation date at the same time.


Save these input data frames in a list, then use rbind or dplyr::bind_rows to produce the final data frame.
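Putting those steps together, here is one possible base-R sketch rather than a definitive solution. It assumes "FRE at the start line" means a line beginning with FRE, and it takes the file date from the parent folder name (as in the question's expected output) instead of the file system's creation date. The helper read_fre and the demo folder are hypothetical.

```r
# Demo tree in a temp dir; "fre_demo" is a hypothetical root folder.
root <- file.path(tempdir(), "fre_demo")
dir.create(file.path(root, "01082019"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "03082019"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("FRE1234567894521234874654153123",
             "GRE2545678145648454186164845416"),
           file.path(root, "01082019", "BD_01082019_132505.txt"))
writeLines("GRE2545678145648454186164845416",   # placeholder: no FRE line
           file.path(root, "01082019", "AB_01082019_120101.txt"))
writeLines("FRE5612345315415343153415389123",
           file.path(root, "03082019", "BD_03082019_132614.txt"))

# Hypothetical helper: return the FRE lines of one file as a data frame.
# Files without any FRE line yield a zero-row data frame.
read_fre <- function(path) {
  lines <- readLines(path, warn = FALSE)
  fre   <- grep("^FRE", lines, value = TRUE)   # lines starting with FRE
  data.frame(
    file_path = rep(path, length(fre)),
    file_date = rep(basename(dirname(path)), length(fre)),  # folder name
    fre_data  = fre,
    stringsAsFactors = FALSE
  )
}

files    <- list.files(root, pattern = "\\.txt$",
                       recursive = TRUE, full.names = TRUE)
per_file <- lapply(files, read_fre)      # list of data frames
result   <- do.call(rbind, per_file)     # one combined data frame
```

If the real creation timestamp is wanted instead, file.info(path)$ctime is a candidate, though what ctime means differs between operating systems.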

