I am using the read_delim function from the readr package to read a text file. The content of the text file is:
9:00/ aaaaa
9:01/ bbbbb
9:04/ ccccc
9:07/ ddddd
12:06/ eeeee
12:13/ fffff
3:25/ ggggg
If I use the code
test1<-read_delim(paste0("./test1.txt"),delim="/",col_names=F)
then I get
test1
A tibble: 7 × 2
X1 X2
1 9:00 " aaaaa"
2 9:01 " bbbbb"
3 9:04 " ccccc"
4 9:07 " ddddd"
5 12:06 " eeeee"
6 12:13 " fffff"
7 3:25 " ggggg"
I want the first column to be of type time, so I tried
test1a<-read_delim(paste0("./test1.txt"),delim="/",col_names=F,col_types = c("t","c"))
But I get a warning message
Warning message:
One or more parsing issues, see problems() for details
test1a
A tibble: 7 × 2
X1 X2
1 09:00 " aaaaa"
2 09:01 " bbbbb"
3 09:04 " ccccc"
4 09:07 " ddddd"
5 NA " eeeee"
6 12:13 " fffff"
7 03:25 " ggggg"
Could someone help me with this?
With the file content you provided, this works for me:
library(readr)
library(dplyr)
result <- read_delim('test_file.txt', delim = '/', col_names = FALSE) |>
  # import the time column as character, then transform using lubridate
  mutate(col1 = lubridate::hm(X1))
#> Rows: 7 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "/"
#> chr (2): X1, X2
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
result
#> # A tibble: 7 × 3
#> X1 X2 col1
#> <chr> <chr> <Period>
#> 1 9:00 " aaaaa" 9H 0M 0S
#> 2 9:01 " bbbbb" 9H 1M 0S
#> 3 9:04 " ccccc" 9H 4M 0S
#> 4 9:07 " ddddd" 9H 7M 0S
#> 5 12:06 " eeeee" 12H 6M 0S
#> 6 12:13 " fffff" 12H 13M 0S
#> 7 3:25 " ggggg" 3H 25M 0S
Created on 2022-08-21 by the reprex package (v2.0.1)
Kind regards
It looks like read_delim cannot read 12:06 as a time correctly. Is this a bug? I know I can handle it with additional steps, but I wonder why the col_types setting is not working.
In fact, if I remove the 12:06 record, the first column is read as a time type correctly.
Exactly, I don't know why this strange thing happens. I also tried specifying col_types = list(X1 = col_time(format = '%H:%M')) so that read_delim() alone would do the job, but since the problem still occurs I would recommend using lubridate instead.
Copy-pasting your file contents from your first message, RStudio adds a red dot in front of 12:06.
And indeed:
utf8ToInt('5 \ufeff12:06 " eeeee"')  # as copied from the post; the invisible character is written here as \ufeff
#> [1] 53 32 65279 49 50 58 48 54 32 34 32 101
#> [13] 101 101 101 101 34
utf8ToInt('5 12:06 " eeeee"')
#> [1] 53 32 49 50 58 48 54 32 34 32 101 101 101 101 101 34
In the first one, note the 65279: that is U+FEFF, a zero-width no-break space (also used as a byte order mark), so it is essentially invisible. The problem is solved if you delete it.
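If editing the file by hand is not convenient, the character can also be stripped in R before parsing. The following is only a minimal sketch, not a definitive fix: the sample lines are a made-up subset of the file from the question with U+FEFF placed in front of 12:06, and the names raw_lines, bad and clean_lines are just for illustration. It assumes readr 2.0 or later, where literal data can be passed wrapped in I().

```r
library(readr)

# Hypothetical sample: three rows from the question, with the invisible
# U+FEFF character in front of "12:06" (written as an escape so it is visible)
raw_lines <- c("9:00/ aaaaa", "\ufeff12:06/ eeeee", "3:25/ ggggg")

# Parsing the raw text: per the thread, the affected row is expected to
# come back as NA with a parsing warning; problems(bad) lists the details
bad <- read_delim(I(paste(raw_lines, collapse = "\n")), delim = "/",
                  col_names = FALSE, col_types = "tc")

# Remove every U+FEFF, then parse the cleaned text the same way
clean_lines <- gsub("\ufeff", "", raw_lines, fixed = TRUE)
test1 <- read_delim(I(paste(clean_lines, collapse = "\n")), delim = "/",
                    col_names = FALSE, col_types = "tc")

test1
```

For the real file, raw_lines could come from readLines("./test1.txt", encoding = "UTF-8") instead of the hard-coded vector.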