Hi guys, I have a folder on my PC and I list the files in it with the function list.files().
So I have:
results <- list.files("../Tests/", recursive = TRUE)
The class of results is "character", and each element is the path of one file in the folder.
If I run View(results), I get something like this:
[2] 101/Testler Export/801-Yurume Ileri/Test_1/340506.txt
[3] 101/Testler Export/801-Yurume Ileri/Test_1/340527.txt
[4] 101/Testler Export/801-Yurume Ileri/Test_1/340535.txt
[5] 101/Testler Export/801-Yurume Ileri/Test_1/340537.txt
[6] 101/Testler Export/801-Yurume Ileri/Test_1/340539.txt
[7] 101/Testler Export/801-Yurume Ileri/Test_1/340540.txt
[8] 101/Testler Export/801-Yurume Ileri/Test_2/340506.txt
Each element is a .txt file.
I would like to write a "for" loop that says:
if (the file name ends with "340506" or "340527")
read the files (that's fine with read.table) and merge them, for example with rbind.
cderv
February 15, 2019, 7:50pm
2
You can use the pattern = argument in list.files to select only the paths that match the pattern. fs is another option; see the example at https://fs.r-lib.org/reference/dir_ls.html
You can also consider purrr and its map_df function to read and row-bind in one step.
This example, which uses readxl to read several worksheets into one data frame, can help illustrate the process:
https://readxl.tidyverse.org/articles/articles/readxl-workflows.html#concatenate-worksheets-into-one-data-frame
So something like this, to be completed for your case:
fs::dir_ls(folder, regexp = ...) %>%
  purrr::map_df(readr::read_csv)
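As a minimal sketch of the pattern idea in base R only: the throwaway directory below merely imitates the layout in the question, and its folder and file names are made up for illustration.

```r
# Build a small throwaway directory tree (hypothetical layout, illustration only)
root <- file.path(tempdir(), "Tests_demo")
unlink(root, recursive = TRUE)
dirs <- file.path(root, "101", c("Test_1", "Test_2"))
for (d in dirs) dir.create(d, recursive = TRUE)
invisible(file.create(file.path(dirs[1], c("340506.txt", "340527.txt", "340535.txt"))))
invisible(file.create(file.path(dirs[2], "340506.txt")))

# `pattern` is a regular expression matched against each file name,
# so only the two wanted sensors are returned and 340535.txt is skipped
wanted <- list.files(root,
                     pattern = "^(340506|340527)\\.txt$",
                     recursive = TRUE,
                     full.names = TRUE)

basename(wanted)
```

With full.names = TRUE the returned paths can be passed straight to a reading function.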
Applying what Christophe said to your own data would look like this:
library(tidyverse)
list_of_files <- list.files(path = "../Tests/",
                            recursive = TRUE,
                            pattern = "340506\\.txt$|340527\\.txt$",
                            full.names = TRUE)
df <- list_of_files %>%
  map_df(read_table)
And, as I said to you before, please ask your questions with a REPRoducible EXample (reprex).
Sorry, but I'm a beginner in R and I don't know how to use your answer without an example.
The first piece of code,
list_of_files <- .....
is OK.
The second one prints this to the console:
Parsed with column specification:
cols(
  `// Start Time: 0` = col_character()
)
I think this is because, given the structure of the file, I usually open it with
df <- read.table(file = "340506", header = TRUE, fill = TRUE, skip = 4)
How can I do the same thing with read_table (the function you used)?
It's pretty much the same, but you can also use read.table() if you feel more comfortable with that.
library(tidyverse)
list_of_files <- list.files(path = "../Tests/",
                            recursive = TRUE,
                            pattern = "340506\\.txt$|340527\\.txt$",
                            full.names = TRUE)
df <- list_of_files %>%
  map_df(read.table, header = TRUE, fill = TRUE, skip = 4)
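For readers who want to see what "read each file with the same arguments, then stack the rows" amounts to, here is a rough base-R sketch. The sample files are invented (four metadata lines, then a header row and data); only the `// Start Time: 0` line is taken from this thread, and `make_file` is a hypothetical helper.

```r
# Hypothetical helper: write a sample file with four metadata lines,
# then a header row and data. The contents are invented for illustration.
make_file <- function(path, values) {
  writeLines(c("// Start Time: 0",
               "// metadata line 2",
               "// metadata line 3",
               "// metadata line 4",
               "time value",
               paste(seq_along(values), values)),
             path)
  path
}

demo_dir <- file.path(tempdir(), "read_demo")
dir.create(demo_dir, showWarnings = FALSE)
files <- c(make_file(file.path(demo_dir, "340506.txt"), c(10, 20)),
           make_file(file.path(demo_dir, "340527.txt"), c(30, 40, 50)))

# skip = 4 drops the metadata lines, header = TRUE takes names from the next line
tables <- lapply(files, read.table, header = TRUE, fill = TRUE, skip = 4)

# Stack the per-file tables into one data frame
combined <- do.call(rbind, tables)

nrow(combined)   # 2 rows from the first file plus 3 from the second
names(combined)
```

This only works with rbind when every file yields the same column names, which is assumed here.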
Sorry if I keep asking, but I'm not good at this at all.
OK, now I have
list_of_files <- list.files(path = "../Tests/",
                            recursive = TRUE,
                            pattern = "340506\\.txt$|340527\\.txt$",
                            full.names = TRUE)
df <- list_of_files %>%
  map_df(read.table, header = TRUE, fill = TRUE, skip = 4)
The result of
list_of_files[1]
is
"../Tests/101/Testler Export/801-Yurume Ileri//Test_1/340535.txt"
I want to add new columns to df to identify the subject, act and test. I have tried with
df$subject <- strsplit(list_of_files[1], "/")[[1]][3]
# returns "101"
df$act <- strsplit(list_of_files[1], "/")[[1]][6]
# returns "801-Yurume Ileri"
df$test <- strsplit(list_of_files[1], "/")[[1]][7]
# returns "Test_1"
So, if I want to add these new columns for all the files read with read.table, where can I put a "for" loop like this?
list_of_files <- list.files(path = "../Tests/", recursive = TRUE,
                            pattern = "340535\\.txt$",
                            full.names = TRUE)
df <- list_of_files %>%
  map_df(read.table, header = TRUE, fill = TRUE, skip = 4)
for (i in 1:length(list_of_files)) {
  df$subject <- strsplit(list_of_files[i], "/")[[1]][3]
  df$act <- strsplit(list_of_files[i], "/")[[1]][6]
  df$test <- strsplit(list_of_files[i], "/")[[1]][7]
  df$sensor <- strsplit(list_of_files[i], "/")[[1]][8]
}
cderv
February 16, 2019, 5:56pm
8
One way would be to do that inside the map, so that each table is labelled before the tables are bound together:
list_of_files <- list.files(path = "../Tests/", recursive = TRUE,
                            pattern = "340535\\.txt$",
                            full.names = TRUE)
df <- list_of_files %>%
  map_df( ~ {
    file_path <- .x
    tab <- read.table(.x, header = TRUE, fill = TRUE, skip = 4)
    tab$subject <- strsplit(file_path, "/")[[1]][3]
    tab$act <- strsplit(file_path, "/")[[1]][6]
    tab$test <- strsplit(file_path, "/")[[1]][7]
    tab$sensor <- strsplit(file_path, "/")[[1]][8]
    tab
  })
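One caveat with fixed positions after strsplit(): the paths quoted in this thread contain a doubled slash in different places, and each "//" adds an empty element that shifts the indices. A possible workaround, sketched below under the assumption that every path ends in subject/Testler Export/act/test/sensor.txt, is to split on runs of slashes and count from the end. `path_fields` is a hypothetical helper, not something from the answers above.

```r
# Hypothetical helper: derive the four fields from a path, counting from the end
# so that a doubled slash anywhere in the path does not shift the positions.
path_fields <- function(path) {
  parts <- strsplit(path, "/+")[[1]]   # "/+" treats "//" like a single "/"
  n <- length(parts)
  data.frame(subject = parts[n - 4],
             act     = parts[n - 2],
             test    = parts[n - 1],
             sensor  = sub("\\.txt$", "", parts[n]),
             stringsAsFactors = FALSE)
}

paths <- c("../Tests/101/Testler Export/801-Yurume Ileri//Test_1/340535.txt",
           "../Tests//102/Testler Export/811-Sandalye/Test_3/340535.txt")

fields <- do.call(rbind, lapply(paths, path_fields))
fields
```

The same helper could be called inside the map on each file path instead of the four strsplit() lines.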
The other way would be to use dplyr to manipulate your resulting table. If you name list_of_files, the .id argument in map_df would be useful to store the name in a column. If not, you could store the index and then subset list_of_files with this index.
I would suggest reading https://r4ds.had.co.nz/ to see how to manipulate data with the tidyverse.
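To illustrate the "name the list and keep the name as a column" idea without any packages, here is a rough base-R analogue. The toy data frames stand in for what read.table() would return, and their values are invented.

```r
# Toy stand-ins for the tables read.table() would return (invented values)
tables <- list(
  "../Tests/101/Testler Export/801-Yurume Ileri/Test_1/340535.txt" =
    data.frame(time = 1:2, value = c(0.1, 0.2)),
  "../Tests/102/Testler Export/811-Sandalye/Test_3/340535.txt" =
    data.frame(time = 1:3, value = c(0.3, 0.4, 0.5))
)

# Tag each table with its own name BEFORE binding,
# so every row keeps a record of the file it came from
tagged <- Map(function(tab, name) {
  tab$file_name <- name
  tab
}, tables, names(tables))

df <- do.call(rbind, unname(tagged))
rownames(df) <- NULL

table(df$file_name)
```

This also shows why the for loop in the question does not work: assigning df$subject <- value after the tables are already bound overwrites the whole column on every iteration, so only the values from the last file remain.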
Another way to do it, using regular expressions and mutate:
library(tidyverse)
library(stringr)
list_of_files <- list.files(path = "../Tests/",
                            recursive = TRUE,
                            pattern = "340506\\.txt$|340527\\.txt$",
                            full.names = TRUE)
df <- list_of_files %>%
  setNames(nm = .) %>%
  map_df(read.table, header = TRUE, fill = TRUE, skip = 4, .id = "file_name") %>%
  mutate(subject = str_extract(file_name, "(?<=Tests/)[0-9]+(?=/)"),
         act = str_extract(file_name, "(?<=Export/).+(?=//)")
  )
It returns this error:
Error in mutate_impl(.data, dots) :
Evaluation error: object 'file_name' not found.
Sorry, I forgot the .id argument:
map_df(read.table, header = TRUE, fill = TRUE, skip = 4, .id = "file_name") %>%
This is exactly why we usually ask for a reproducible example; it is hard to test a solution beforehand without sample data.
andresrcs:
.id = "file_name"
OK... excuse me, Andresrcs, but I'm in crisis.
I also posted another topic with the complete code: https://forum.posit.co/t/error-in-data-frame-tmp-subject-value-replacement-has-1-row-data-has-0/24057 . In the meantime I'm trying your code; I hope it works.
I think my code is not efficient at all, but I'm not experienced with R and I'm trying to get something that at least works.
Your solution is very, very fast at reading 12 million instances.
It still gives me problems. If it's not too much to ask, could you look at the link I posted to get a general idea and, along the lines of what you wrote here, give me a tip on how to proceed?
I'm sorry, and thank you very much.
dario_gd:
still gives me problems
What problems are you talking about? I sincerely don't understand what else you need; the solution I already gave you is almost a direct substitute for your code in the other topic.
Could you elaborate a little more on your request?
Could you please also write the code to add the "test" and "sensor" columns?
These are two possible values of list_of_files[i]:
"../Tests//101/Testler Export/801-Yurume Ileri/Test_1/340535.txt"
"../Tests//102/Testler Export/811-Sandalye/Test_3/340535.txt"
For the first one I would like:
subject 101
act "801-Yurume Ileri"
test "Test_1"
sensor "340535"
I ran the code you wrote, but the subject column is NA, and the act column contains "801-Yurume Ileri/Test_1" rather than just "801-Yurume Ileri".
Also, where can I study and learn how to use syntax like the one you used,
(?<=Tests/)[0-9]+(?=/)
so that I can improve?
I don't want to be repetitive, but thank you very much.
The file names you gave last are different from the one you gave first (the doubled slash is in a different place), which is why the code was failing. It should work now, at least for the examples you are giving.
library(stringr)
file_name <- c("../Tests/101/Testler Export/801-Yurume Ileri//Test_1/340535.txt",
               "../Tests//101/Testler Export/801-Yurume Ileri/Test_1/340535.txt",
               "../Tests//102/Testler Export/811-Sandalye/Test_3/340535.txt")
str_extract(file_name, "(?<=Tests//?)[0-9]+(?=/)")
#> [1] "101" "101" "102"
str_extract(file_name, "(?<=Export/)[^/]+(?=/+T)")
#> [1] "801-Yurume Ileri" "801-Yurume Ileri" "811-Sandalye"
str_extract(file_name, "(?<=/)[^/]+(?=/[:digit:]+\\.txt)")
#> [1] "Test_1" "Test_1" "Test_3"
str_extract(file_name, "(?<=/)[:digit:]+(?=\\.txt)")
#> [1] "340535" "340535" "340535"
Created on 2019-02-16 by the reprex package (v0.2.1)
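On the question about syntax like (?<=Tests/)[0-9]+(?=/): (?<=...) is a lookbehind and (?=...) is a lookahead. They require certain text before or after the match without including it in the result. The sketch below shows the idea in base R with perl = TRUE instead of stringr; `extract_first` is a hypothetical wrapper name. Note that the perl = TRUE engine needs fixed-length lookbehinds, so the optional-slash form (?<=Tests//?) used with stringr above is deliberately left out here.

```r
path <- "../Tests/101/Testler Export/801-Yurume Ileri/Test_1/340535.txt"

# Hypothetical wrapper around base R's regexpr()/regmatches():
# returns the first match of `pattern` in `x`
extract_first <- function(x, pattern) {
  regmatches(x, regexpr(pattern, x, perl = TRUE))
}

# (?<=Export/) : the match must be preceded by "Export/", which is not returned
# [^/]+        : one or more characters that are not a slash
act <- extract_first(path, "(?<=Export/)[^/]+")

# (?=\\.txt$)  : the match must be followed by ".txt" at the end of the string
sensor <- extract_first(path, "[0-9]+(?=\\.txt$)")

# Both combined: digits that sit between "Tests/" and the next "/"
subject <- extract_first(path, "(?<=Tests/)[0-9]+(?=/)")

c(subject = subject, act = act, sensor = sensor)
```

Reading each pattern as "preceded by / the part to keep / followed by" makes the longer patterns in this thread easier to take apart.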
So, this code should work with your data:
library(tidyverse)
library(stringr)
list_of_files <- list.files(path = "../Tests/",
                            recursive = TRUE,
                            pattern = "340506\\.txt$|340527\\.txt$",
                            full.names = TRUE)
df <- list_of_files %>%
  setNames(nm = .) %>%
  map_df(read.table, header = TRUE, fill = TRUE, skip = 4, .id = "file_name") %>%
  mutate(subject = str_extract(file_name, "(?<=Tests//?)[0-9]+(?=/)"),
         act = str_extract(file_name, "(?<=Export/)[^/]+(?=/+T)"),
         test = str_extract(file_name, "(?<=/)[^/]+(?=/[:digit:]+\\.txt)"),
         sensor = str_extract(file_name, "(?<=/)[:digit:]+(?=\\.txt)")
  )
What I'm using is called "Regular Expressions", if you're unfamiliar with them, I’d recommend starting at
Really, thanks. Now I'll try to study it.
PS: the code works.
If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems.
system
Closed
February 23, 2019, 9:52pm
19
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed. If you have a query related to it or one of the replies, start a new topic and refer back with a link.