When I upload an excel file with data categorized as time in Excel, eg 12:10:33 PM, it shows up as 1899-12-31 12:10:33 in R. How do I remove "1899-12-31 " so it only shows time?
Read it as text and then convert it to another class as needed or perform data extraction later.
If you need more specific help, please provide a proper REPRoducible EXample (reprex) illustrating your issue.
Hi Thank you! I'm a beginner, so forgive me for ignorance:
How do I "convert it to another class as needed"? I can convert it to text in Excel, but not sure how to do the part in R
Not sure if the following format/info is appropriate, but I've included some more info below. You can see that dates show up as 1899-12-31 12:10:33. I'd like to only keep mm/dd/ss.
And if possible, I'd ideally like to combine date and time in excel into one cell, and then upload into R.
The data is located here: https://github.com/derek-l-thomas/time.data.git
library("readxl")
library("tidyverse")
#Create P3 total (p3_end_time - p3_begin_time)
mutate(driver_data, p3_total = p3_end_time - p3_begin_time)
A tibble: 90 x 9
p2_request_date p2_request_time p3_begin_date
1 2021-03-03 00:00:00 1899-12-31 12:10:33 2021-03-03 00:00:00
mutate(driver_data, p3_total = p3_end_time - p3_begin_time)
#> Error in mutate(driver_data, p3_total = p3_end_time - p3_begin_time): could not find function "mutate"
I think this example will show you how to perform the individual steps and then you can adapt it as needed
library(dplyr)
library(readxl)
library(hms)
# This part is only for loading the sample data, use your own file instead.
link <- "https://github.com/derek-l-thomas/time.data/raw/main/derek_sample.xlsx"
download.file(link, "driver_data.xlsx")
driver_data <- readxl::read_excel("driver_data.xlsx")
driver_data %>%
mutate(across(contains("time"), as_hms),
across(contains("date"), as.Date),
p2_time_stamp = as.POSIXct(paste(p2_request_date, p2_request_time))) %>%
select(p2_time_stamp, everything())
#> # A tibble: 90 x 9
#> p2_time_stamp p2_request_date p2_request_time p3_begin_date
#> <dttm> <date> <time> <date>
#> 1 2021-03-03 12:10:33 2021-03-03 12:10:33 2021-03-03
#> 2 2021-03-03 11:29:21 2021-03-03 11:29:21 2021-03-03
#> 3 2021-03-03 10:44:26 2021-03-03 10:44:26 2021-03-03
#> 4 2021-03-03 10:08:28 2021-03-03 10:08:28 2021-03-03
#> 5 2021-03-03 09:42:02 2021-03-03 09:42:02 2021-03-03
#> 6 2021-03-03 09:32:20 2021-03-03 09:32:20 2021-03-03
#> 7 2021-03-03 08:45:00 2021-03-03 08:45:00 2021-03-03
#> 8 2021-03-03 07:30:00 2021-03-03 07:30:00 2021-03-03
#> 9 2021-03-03 06:22:22 2021-03-03 06:22:22 2021-03-03
#> 10 2021-03-02 15:11:13 2021-03-02 15:11:13 2021-03-02
#> # … with 80 more rows, and 5 more variables: p3_begin_time <time>,
#> # p3_end_date <date>, p3_end_time <time>, trip <chr>,
#> # gross_orig_driver_pay <dbl>
Created on 2021-04-26 by the reprex package (v2.0.0)
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.
If you have a query related to it or one of the replies, start a new topic and refer back with a link.