I think the problem is that the data in the Excel file is in the custom format '[h]:mm:ss', which doesn't seem to be easily read by read_excel
, which can only parse dates from Excel.
Here's solution that uses read_excel()
directly, but takes care of assigning the correct data types:
library(tidyverse)
library(readxl)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
# assuming below is result of:
# df <- read_excel([your excel file])
# dput(df)
df <-
structure(list(...1 = c(1, 2, 3, 4, 5, 6), t3 = structure(c(-2208958560,
NA, NA, NA, NA, -2208958560), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
t4 = structure(c(-2208958560, NA, NA, NA, NA, -2208958560
), class = c("POSIXct", "POSIXt"), tzone = "UTC"), t5 = structure(c(-2208958560,
NA, NA, NA, NA, -2208958560), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), t8 = structure(c(-2208958560, NA, NA,
NA, NA, -2208958560), class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
### end of 'structure()' call
# save the date Excel inserts for custom [h]:mm:ss format
xl.origin <- '1900-01-01' %>% as_datetime()
# subtract xl.origin from column values and convert to duration
df %>%
# give first column a proper name
rename(row = ...1) %>%
mutate_at(
.vars = vars(t3:t8),
.funs = function(column) {(column - xl.origin) %>% as.duration()}
)
#> # A tibble: 6 x 5
#> row t3 t4 t5
#> <dbl> <Duration> <Duration> <Duration>
#> 1 1 30240s (~8.4 hours) 30240s (~8.4 hours) 30240s (~8.4 hours)
#> 2 2 NA NA NA
#> 3 3 NA NA NA
#> 4 4 NA NA NA
#> 5 5 NA NA NA
#> 6 6 30240s (~8.4 hours) 30240s (~8.4 hours) 30240s (~8.4 hours)
#> # … with 1 more variable: t8 <Duration>
# convert to numeric columns representing minutes, if desired
df %>%
# give first column a proper name
rename(row = ...1) %>%
mutate_at(
.vars = vars(t3:t8),
.funs = function(column) {(column - xl.origin) %>% as.duration()}
) %>%
mutate_all(~ as.numeric(., 'minutes'))
#> # A tibble: 6 x 5
#> row t3 t4 t5 t8
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 504 504 504 504
#> 2 2 NA NA NA NA
#> 3 3 NA NA NA NA
#> 4 4 NA NA NA NA
#> 5 5 NA NA NA NA
#> 6 6 504 504 504 504
Created on 2020-03-08 by the reprex package (v0.3.0)
Does this code make sense to you, @Traels?