nchan08
November 16, 2021, 12:51am
1
I am new to RStudio. I am trying to import the following text file (.txt) into RStudio, and I would like to remove the first 6 rows (all the text).
AAA: true
BBB: false
CCC: 0
DDD: Wavelengths
EEE: 3648
>>>>>Begin Decimal Data<<<<<
623.821 -216.03
623.881 -216.03
623.941 -216.03
Then I would like to create a data table with two columns, Col A and Col B, as follows.
Col A Col B
623.821 -216.03
623.881 -216.03
623.941 -216.03
I have a series of .txt files, so I would like to read them all, delete the header rows, and separate the columns. My code is as follows:
files <- dir(".", pattern = ".txt$")
for (i in 1:length(files)) {
obj_name <- files %>% str_sub(start = -2)
assign(obj_name[i], read_table(files[i], skip = 13))
names(obj_name[i]) <- c("Col_A", "Col_B")
}
But it shows the following warning:
-- Column specification -------------------------------------------------------------
cols(
`>>>>>Begin` = col_double(),
Spectral = col_double(),
`Data<<<<<` = col_character()
)
Warning: 3648 parsing failures.
row col expected actual file
1 -- 3 columns 2 columns 'L1_0401_1758_C.txt'
2 -- 3 columns 2 columns 'L1_0401_1758_C.txt'
3 -- 3 columns 2 columns 'L1_0401_1758_C.txt'
4 -- 3 columns 2 columns 'L1_0401_1758_C.txt'
5 -- 3 columns 2 columns 'L1_0401_1758_C.txt'
... ... ......... ......... ....................
See problems(...) for more details.
Perhaps something like read_tsv? There is an argument to skip rows:
read_csv() and read_tsv() are special cases of the more general
read_delim(). They're useful for reading the most common types of
flat file data, comma separated values and tab separated values,
respectively. read_csv2() uses ; for the field...
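The column specification in the warning above (`>>>>>Begin`, Spectral, `Data<<<<<`) suggests that with skip = 13 the marker line itself was read as the header row, so the skip count was one line short. Rather than hard-coding the count, the marker line can be located first. The following is only a sketch in base R: it assumes every file contains one line starting with ">>>>>Begin" and that the data are whitespace-separated, and it uses a hypothetical temporary file that copies the sample from the first post.

```r
# Hypothetical sample file mimicking the layout shown in the question
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "AAA: true", "BBB: false", "CCC: 0", "DDD: Wavelengths", "EEE: 3648",
  ">>>>>Begin Decimal Data<<<<<",
  "623.821 -216.03", "623.881 -216.03", "623.941 -216.03"
), tmp)

# Find the line number of the marker (assumption: it starts with ">>>>>Begin")
lines  <- readLines(tmp)
marker <- grep("^>>>>>Begin", lines)[1]

# Skip everything up to and including the marker, then name the two columns
dat <- read.table(tmp, skip = marker, header = FALSE,
                  col.names = c("Col_A", "Col_B"))
print(dat)
```

With this approach the number of header lines may differ between files, as long as the marker line is present.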
nchan08
November 16, 2021, 2:16am
3
Thanks, @williaml! I tried read_tsv() as well, but I can't get the expected result. It does read the text file, but the skip argument doesn't seem to work.
All the header text is still included in the result, and the data end up in a single column.
Data.from.HR4D5021_005237.txt.Node
1 AAA: true
2 BBB: false
3 CCC: 0
4 DDD: Wavelengths
5 EEE: 3648
6 >>>>>Begin Decimal Data<<<<<
7 623.821
8 -216.03
9 623.881
10 -216.03
11 623.941
12 -216.03
nchan08
November 16, 2021, 2:54am
4
When I run it on a single file, read.delim() seems to work:
Test1111 <- read.delim("C:/L1_0401_0603_B.txt", header = FALSE, skip = 14)
But when I run it on a series of files, it fails again.
files <- dir(".", pattern = ".txt$")
for (i in 1:length(files)) {
obj_name <- files %>% str_sub(start = -2)
assign(obj_name[i], read_delim(files[i]))
read_delim(files[i], header = FALSE, skip=14)
}
Error:
Rows: 3660 Columns: 4
-- Column specification -------------------------------------------------------------
Delimiter: " "
chr (4): Data, from, HR4D5021_005094.txt, Node
i Use spec() to retrieve the full column specification for this data.
i Specify the column types or set show_col_types = FALSE to quiet this message.
Error in read_delim(files[i], header = FALSE, skip = 14) :
unused argument (header = FALSE)
What's wrong with this code?
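One likely reading of the error above: `header` is an argument of base R's read.delim() (the function actually called in the single-file example), while readr's read_delim() has no such argument and takes column names through `col_names`. The sketch below is not a definitive fix. It assumes a single space as the delimiter, uses a hypothetical temporary file copying the sample from the first post, and skips 6 lines to match that sample (the real files apparently need skip = 14).

```r
library(readr)

# Hypothetical sample file mimicking the layout shown in the question
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "AAA: true", "BBB: false", "CCC: 0", "DDD: Wavelengths", "EEE: 3648",
  ">>>>>Begin Decimal Data<<<<<",
  "623.821 -216.03", "623.881 -216.03", "623.941 -216.03"
), tmp)

# readr::read_delim() uses `col_names`, not `header`;
# passing a character vector both disables the header row and names the columns
dat <- read_delim(tmp, delim = " ", skip = 6,
                  col_names = c("Col_A", "Col_B"),
                  show_col_types = FALSE)
print(dat)
```

Note also that in the loop above, the first read_delim(files[i]) call is made without `skip`, which is why the column specification lists the words of the first header line (Data, from, HR4D5021_005094.txt, Node) as four character columns.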
Hi @nchan08, you can try the following code. Hope this helps.
This code is for pipe-separated ("|") .txt files; for any other separator, set delim = <separator>.
library(plyr)   # ldply()
library(readr)  # read_delim(), cols()

# Paths of the files in the working directory
paths <- dir(getwd())
# Name each path after its file so ldply() can label the rows
names(paths) <- basename(paths)
# Read and combine all the files, skipping the first 6 rows of each
DataSet <- ldply(paths, read_delim, delim = "|", escape_double = FALSE,
col_names = c("Col1", "Col2", "Col3", "Col4", "Col5"),
col_types = cols(Col1 = col_character(),
Col2 = col_date(format = "%d/%m/%Y")),
trim_ws = TRUE, skip = 6)
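For the two-column, whitespace-separated files in the question, the same read-and-combine idea can be sketched in base R without assign(). In the original loop, str_sub(start = -2) returns "xt" for every file name ending in ".txt", so each iteration overwrote the same object; a named list avoids that. Everything below is an assumption made for illustration: the temporary directory, the file names L1_A.txt and L1_B.txt, the helper read_one(), and the ">>>>>Begin" marker used to find where the data start.

```r
# Two hypothetical files in a temporary directory, reusing the sample rows
dir_path <- file.path(tempdir(), "spectra_demo")
dir.create(dir_path, showWarnings = FALSE)
header <- c("AAA: true", "BBB: false", "CCC: 0", "DDD: Wavelengths",
            "EEE: 3648", ">>>>>Begin Decimal Data<<<<<")
writeLines(c(header, "623.821 -216.03", "623.881 -216.03"),
           file.path(dir_path, "L1_A.txt"))
writeLines(c(header, "623.941 -216.03"),
           file.path(dir_path, "L1_B.txt"))

# "\\.txt$" escapes the dot so only names ending in ".txt" match
files <- list.files(dir_path, pattern = "\\.txt$", full.names = TRUE)

# Hypothetical helper: read one file, skipping through the marker line
read_one <- function(path) {
  marker <- grep("^>>>>>Begin", readLines(path))[1]
  dat <- read.table(path, skip = marker, col.names = c("Col_A", "Col_B"))
  dat$file <- basename(path)  # remember which file each row came from
  dat
}

# One data frame per file, kept in a named list instead of separate objects
tables <- lapply(files, read_one)
names(tables) <- tools::file_path_sans_ext(basename(files))

# Stack all files into a single data frame
combined <- do.call(rbind, tables)
print(combined)
```

Individual files remain available as tables[["L1_A"]] and so on, while combined holds all rows with a file column identifying the source.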