Hi, I've just started learning RStudio and coding, and I'm having some trouble with a few things. I'm trying to merge 20+ files into one data frame and add a column for subject number/ID that identifies which data file each row came from.
e.g. all the data rows from file #1 would be labelled "1" in the subject column, etc.
All of the original data files already have a column labelled "subject". However, this column is blank (our experiment didn't output a subject number, but created a column called subject anyway), so there aren't any subject names in any of the original data files.
I tried implementing solutions from this thread, but I received an error that says "Error: file must be a string, raw vector or a connection."
I already read and merged all the data files using purrr:
Hi, thanks for the quick response.
I tried running the code, but I received this error:
Error: group_indices.default() should only be called in a data context
I'm unable to reproduce your error. Can you please run only list.files(path = "data", full.names = T) and tell me what output you see in the console?
For reading text files that aren't comma-separated, you should generally use read_delim() rather than read_csv(), which assumes commas as the delimiter. Do you get a single data frame as output after running the map_dfr(...) statement?
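For reference, here is a minimal sketch of the read-and-combine step, not necessarily the exact code from earlier in the thread. It assumes tab-delimited files; the temporary folder, the file names "1.txt" and "2.txt", and the rt column are made up purely so the example runs on its own.

```r
library(readr)
library(purrr)
library(dplyr)

# Hypothetical setup: write two small tab-delimited files to a fresh temp folder
data_dir <- tempfile("data_")
dir.create(data_dir)
write_tsv(tibble(rt = c(510, 620)), file.path(data_dir, "1.txt"))
write_tsv(tibble(rt = c(480, 555, 600)), file.path(data_dir, "2.txt"))

# Full paths of every file in the folder
files <- list.files(path = data_dir, full.names = TRUE)

# Name each path by itself so that .id can copy it into a "file_path" column,
# then read every file and row-bind the results into one data frame
combined <- files %>%
  set_names() %>%
  map_dfr(~ read_delim(.x, delim = "\t", col_types = cols()), .id = "file_path")

print(combined)
```

The name given to .id is what the new column ends up being called, so changing it to "file_name" would produce a column with that name instead.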
Ah okay!
I ran the code up to the map_dfr() call and got a single data frame that includes the new column! It's called "file_name" and gives the name of each file ("1", "2", etc.; the files are numbered anyway).
Thank you!
Okay cool. I didn't know what your files were named, so the group_indices() would help if your files didn't contain a sequence number. Strange that the rest of the code doesn't work for you though.
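If a sequence number is still needed, one way to get it without group_indices() is sketched below. This is a swapped-in approach using match() and the file name itself, not the code from earlier in the thread; the combined data frame, its paths, and the rt column are hypothetical stand-ins.

```r
library(dplyr)

# Hypothetical combined data: one row per trial, tagged with its source file
combined <- tibble(
  file_path = c("data/1.txt", "data/1.txt", "data/2.txt", "data/2.txt", "data/2.txt"),
  rt = c(510, 620, 480, 555, 600)
)

labelled <- combined %>%
  mutate(
    # Sequential ID: position of each path among the distinct paths,
    # in order of first appearance
    subject = match(file_path, unique(file_path)),
    # Alternative when the files are already numbered:
    # strip the folder and extension, then convert the name to an integer
    subject_from_name = as.integer(tools::file_path_sans_ext(basename(file_path)))
  )

print(labelled)
```

The first version works for any file names; the second only makes sense when the names are plain numbers, as they appear to be here.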
Are you sure that the new column is called "file_name"? It should be "file_path" if you used the code I gave. I'd advise you to create a reprex so that we can see exactly what's happening by following this guide.