Hi! I'm pretty new to R. Enjoying it immensely.
I'm reading in all CSV files from a directory using map_dfr to apply read_csv over a list of filenames. The CSV files have a varying number of columns. I only want to import columns 1:7 and to discard column 8 onwards where those columns exist. All files have a column 8, and some files have text that is parsed as columns 9, 10, etc. when I look at the files in Excel. I don't care about any of these columns. Notably, these extra columns don't have headers.
My code is:
df_csv <- map_dfr(
  csvpaths,
  read_csv,
  skip_empty_rows = TRUE,
  col_names = TRUE,
  col_select = 1:7,
  col_types = cols_only(
    Project = col_character(),
    Date = col_character(),
    Employee = col_character(),
    Role = col_character(),
    Rate = col_double(),
    Hours = col_double(),
    Amount = col_double()
  ),
  .id = "Source"
)
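For context, csvpaths is not defined in the snippet above. A minimal sketch of how such a vector could be built, assuming all the files sit in one directory. The directory name and file contents here are made up purely for illustration:

```r
# Hypothetical setup: a scratch directory holding two tiny CSV files
csv_dir <- file.path(tempdir(), "timesheets_demo")
dir.create(csv_dir, showWarnings = FALSE)
writeLines(c("Project,Date", "A,2021-01-01"), file.path(csv_dir, "a.csv"))
writeLines(c("Project,Date", "B,2021-01-02"), file.path(csv_dir, "b.csv"))

# Full paths of every .csv file in that directory
csvpaths <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Naming the vector means .id = "Source" stores the file name
# rather than a positional index
csvpaths <- stats::setNames(csvpaths, basename(csvpaths))
```

With an unnamed vector, the Source column would hold 1, 2, 3 and so on instead of the file names.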
What's happening:
- The files are being read in correctly, in that the final df includes the correct information AFAIK.
- In my RStudio console, read_csv is now outputting what looks like a time: 0s. Why? I think this started showing up after my last package update.
- In my latest test of 15 files, I received 8 warnings (the 0s time is output after the start of the warning message, unsure why):
Warning messages: 0s
1: One or more parsing issues, see `problems()` for details
2: One or more parsing issues, see `problems()` for details
3: One or more parsing issues, see `problems()` for details
4: One or more parsing issues, see `problems()` for details
5: One or more parsing issues, see `problems()` for details
6: One or more parsing issues, see `problems()` for details
7: One or more parsing issues, see `problems()` for details
8: One or more parsing issues, see `problems()` for details
I'm pretty sure that the warnings relate to instances where a CSV file has more than 8 columns. But I can't get any output from problems(), so I can't tell what is happening.
> problems()
>
It doesn't seem to matter if I restrict the map_dfr call to just a single filename; I still can't view any output from problems.
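To narrow this down, here is a small self-contained sketch. It assumes readr 2.x, where problems() defaults to .Last.value, so it probably needs the data frame passed in explicitly. It also reads a single file directly rather than through map_dfr, since binding rows may not carry each file's problems along. The file contents are invented to mimic trailing columns without headers:

```r
library(readr)

# Hypothetical file: 8 named columns, one row with two extra unnamed fields
path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "Project,Date,Employee,Role,Rate,Hours,Amount,Notes",
    "P1,2021-01-01,Ann,Dev,10,2,20,ok",
    "P2,2021-01-02,Bob,QA,12,3,36,ok,stray,text"
  ),
  path
)

# Keep the result in a variable so it can be handed to problems()
one_file <- read_csv(path, col_select = 1:7, show_col_types = FALSE)

# Pass the data frame explicitly instead of relying on .Last.value
probs <- problems(one_file)
print(probs)
```

For several files, the same idea could be applied per file before any row binding, for example lapply(csvpaths, function(p) problems(read_csv(p, col_select = 1:7))).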
read_csv IS respecting the argument to select only columns 1:7 in the read, but it ISN'T stopping errors from being created from files which have more than 7 columns, which I thought was the purpose of using col_select in the first place.
How can I get these warnings sorted? I previously wrapped this in suppressWarnings but realised it was masking some other real parsing errors I needed to fix, which I've now done.