Hi, I have these columns that in my excel are hours and minutes (for example, '3:00:00') but R imported them incorrectly as dates (following the example above, '1899-12-31 03:00:00:00').
I need to extract the hours and minutes correctly so that I can do calculations with them.
I saw this way
format(as.POSIXct(.), format = "%H:%M")
but it doesn't work for me because I need to do calculations later and be able to calculate statistical summaries.
I saw that lubridate also has duration() that allows me to calculate how many minutes in total are but first I need to get the interval and I don't know how to do it.
Ways to calculate with time-related objects depend on converting a string representation, such as "3:00:00" to either an integer directly, by parsing the string and multiplying its components by 24, 60 and 60 and converting to numeric, then adding them together to get seconds or manipulating them to get fractional minutes or hours. The other way is to convert to a datetime object. For that you need to tack on a date component.
as_is <- "3:00:00" # imported
add_date <- "2023-03-23" # add an arbitrary date
date_string <- paste(add_date,as_is) # make a date time string
o <- lubridate::ymd_hms(date_string) # datetime object
d <- data.frame(event1 = o) # add to data frame
# repeat for second object
as_is <- "4:34:23" # imported
add_date <- "2023-03-23" # add an arbitrary date
date_string <- paste(add_date,as_is) # make a date time string
o <- lubridate::ymd_hms(date_string) # datetime object
d$event2 <- o
e <- d[2] - d[1]
str(e)
#> 'data.frame': 1 obs. of 1 variable:
#> $ event2: 'difftime' num 1.57305555555556
#> ..- attr(*, "units")= chr "hours"
# convert to decimal if desired
as.numeric(e[[1]])
#> [1] 1.573056
I have one more question, there is a way to change the decimal system from output to time? I mean, instead of 0.75 to show 0.45 and instead of 1.5 to show 1.3. So with the results of statistical calculations I don't have to convert it from decimal system to minutes.
This is not how I need it. I need the unit to be hours, not minutes. that's why in the examples that I gave you 1.5 I need it to be 1.3 and not 90. 45 minutes is 0.45 and not 45
I dont see how going from a straightforward decimal representation of hours like 1.5 hours to a mixed unit where before the decimal is hours and after the decimal is minutes a la 1.3 can possibly help with calculations, surely its definitely less helpful for doing calculations ??
I'm trying to understand, what you did was that the add_time column is the sum of aa and bb? and the cc column? and then how do you do statistical calculations with that column?
It's not really a mixed representation, it's just that the times are represented as they are in real life. An efficient way to average 2:30 and 3:00 and get 2:45.
It will always be easier to do some calculations and the ouput will be the desired one and not the output will be an intermediate step to the desired result. Or do you measure time with the decimal system and not with the sexagesimal system on a day-to-day basis?