Calculating elapsed time in R given one column and taking care of day changes

I have the following data frame which gives a transaction id (Transaction) and time of transaction (transTime) for a retail outlet:

I am calculating time elapsed from one transaction to the next based on the following code:

mutate(elapsed = transTime - lag(transTime, default = first(transTime)))

However, this calculation becomes erroneous at the end of the day when a retail outlet closes shop at the end of the day and resumes business the next day. For example, the circled consecutive transactions, 536597 & 536598 highlight the moving from the end of the day on the 1st of December 2010 to the start of another day on the 2nd of December 2010 should not the used in the calculation of time elapsed between these two transactions as it would be meaningless as the retail outlet would have been closed and such a calculation would be meaningless and gives an erroneous very high elapsed time. The calculation should restart the next morning the retail shop opens, i.e. on the next day.

Is there a way of changing my code to take care of this?


Yes, you can use group_by() and calculate the time difference only within each day. E.g.

my_data %>%
    group_by(day = as.Date(Date)) %>%
    mutate(elapsed = transTime - lag(transTime, default = first(transTime))) %>%
1 Like

Noted with thanks, let me try it out.


This is a different question, for the sake of keeping the forum tidy, please ask it on a new topic providing a proper reprex (including sample data and library calls).

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.