I have a set of directories and subdirectories where I store files every day. I want to search all directories and subdirectories for files that contain lines starting with FRE, copy the contents of the search results, and store them in a data frame. I usually use findstr on the Windows command line. Is it possible to do this in R?
For instance:
Directories and subdirectories:
c:/MyFiles/2019/082019/01082019
c:/MyFiles/2019/082019/02082019
c:/MyFiles/2019/082019/03082019
Files in folder 01082019:
- AB_01082019_120101.txt
- AB_01082019_121053.txt
- BD_01082019_132505.txt
Files in folder 02082019:
- AB_02082019_120301.txt
- AB_02082019_121102.txt
- BD_02082019_132408.txt
Files in folder 03082019:
- AB_03082019_120215.txt
- BD_03082019_120906.txt
- BD_03082019_132614.txt
Files content (BD_*.txt files):
line 1: FRE1234567894521234874654153123
line 2: GRE2545678145648454186164845416
line 3: FRE5612345315415343153415389123
Intended data frame structure:
File path, File Date, FRE Data
Line 1: c:/MyFiles/2019/082019/01082019/BD_01082019_132505.txt, 01082019, FRE1234567894521234874654153123
Line 2: c:/MyFiles/2019/082019/03082019/BD_03082019_132614.txt, 03082019, FRE5612345315415343153415389123
list.files will allow you to browse directories and use regex patterns to filter for the files you want (set recursive = TRUE to include subdirectories). The resulting file names (with full paths if you set full.names = TRUE) are returned as a character vector. In a next step you can pass each path to any reading function you like, depending on the type of file (e.g. read.table), to load the data into data frames.
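As a rough sketch of that first step: the helper name find_bd_files and the BD_*.txt pattern are assumptions taken from the example in the question, not something fixed by R.

```r
# Hypothetical helper: list every BD_*.txt file below `root`.
find_bd_files <- function(root) {
  list.files(
    path = root,
    pattern = "^BD_.*\\.txt$",  # a regex matched against the file name, not a glob
    recursive = TRUE,           # descend into subdirectories
    full.names = TRUE           # return paths that include `root`
  )
}

# Example call, assuming the folder layout from the question:
# files <- find_bd_files("c:/MyFiles")
```

The result is a plain character vector, so it can be passed straight to lapply or a purrr mapping function.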
Using mapping functions (like map from the purrr package), you can then load all files into a list of data frames without having to do it one by one. If all files have the same structure, you can even merge all data frames into one big frame using map_df (also from purrr).
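A possible sketch of the mapping step, assuming purrr and dplyr are installed (map_dfr relies on dplyr to bind rows) and that every file holds one value per line as in the question. The helper name read_all and the added path column are illustrative only.

```r
library(purrr)

# Hypothetical helper: read each file and stack the results into one data frame.
# Assumes all files share the same single-column layout.
read_all <- function(paths) {
  map_dfr(paths, function(p) {
    df <- read.table(p, header = FALSE, colClasses = "character")
    df$path <- p  # remember which file each row came from
    df
  })
}

# Example call with a character vector of file paths:
# all_data <- read_all(files)
```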
Then write a function that reads the first few lines (n=??) of each file to determine whether "FRE" is present. Use that information to decide which files to read in full, saving the path and the file creation date (available from file.info) at the same time:
?file.info
Save these input data frames into a list; then use do.call(rbind, ...) or dplyr::bind_rows to produce the final data frame.
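Putting the pieces together, a minimal base R sketch might look like the following. It makes several assumptions: lines begin directly with FRE, the date is taken from the parent folder name (as in the desired output in the question) instead of from the file system, and each file is small enough to read in full instead of peeking at the first n lines. The names read_fre and collect_fre are made up for illustration.

```r
# Hypothetical helper: return the lines of one file that start with "FRE",
# together with the file path and the name of the folder that contains it.
read_fre <- function(path) {
  lines <- readLines(path, warn = FALSE)
  hits <- grep("^FRE", lines, value = TRUE)
  data.frame(
    file_path = rep(path, length(hits)),
    file_date = rep(basename(dirname(path)), length(hits)),
    fre_data  = hits,
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: apply read_fre to every .txt file below `root`
# and stack the per-file data frames. Returns NULL if no files are found.
collect_fre <- function(root) {
  files <- list.files(root, pattern = "\\.txt$",
                      recursive = TRUE, full.names = TRUE)
  do.call(rbind, lapply(files, read_fre))
}

# Example call, assuming the folder layout from the question:
# result <- collect_fre("c:/MyFiles")
```

Files without any matching line contribute a zero-row data frame, which rbind simply drops.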