I think you are missing some code in the message. The code block below that I tried adding is blank.
as.difftime instead of just difftime
Oh, okay. My reading of the help is that it is intended to translate other date formats from other packages into {lubridate} format, so "NA" is what you would expect here.
Are you still getting this error?
Error in as.POSIXlt.character(x, tz, ...) :
character string is not in a standard unambiguous format.
It looks like it is telling us that it thinks that one or both of
all_trips$started_at
all_trips$ended_at
is not in standard POSIXct
format. Weird, as my version is working fine.
Try
class(all_trips$started_at)
class(all_trips$ended_at)
In both cases the output should be:
"POSIXct" "POSIXt"
Yes that's what I am getting
Can you run the script again and copy all the error message(s) again between the
```
```
Maybe a cleaner layout may help.
Also try
all_trips$ended_at[1]
all_trips$started_at[1]
I get
> all_trips$ended_at[1]
[1] "2019-01-01 00:11:07 UTC"
> all_trips$started_at[1]
[1] "2019-01-01 00:04:37 UTC"
Error in class(all_trips$ended_at, all_trips$started_at) :
2 arguments passed to 'class' which requires 1
Error: unexpected symbol in:
"
all_trips"
Error: unexpected symbol in:
"all_trips$ride_length <- difftime(all_trips$ended_at[1]
all_trips"
That's the error I get when running
all_trips$ride_length <- difftime(all_trips$ended_at[1], all_trips$started_at[1])
I noticed that when I add a code delimiter (```) here, the code chunk doesn't show, hence my copying and pasting it as it is.
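For what it's worth, the first error above is what class() raises when it is handed two objects in one call, and "unexpected symbol" is a parse error that usually means two expressions ended up on one line without a comma or line break between them. A minimal sketch of checking both columns in one go, using a made-up two-row stand-in for all_trips (the toy data frame is an assumption, not the real data):

```r
# Toy stand-in for all_trips; the real data frame is much larger.
all_trips <- data.frame(
  started_at = as.POSIXct(c("2019-01-01 00:04:37", "2019-01-01 00:08:13"), tz = "UTC"),
  ended_at   = as.POSIXct(c("2019-01-01 00:11:07", "2019-01-01 00:15:34"), tz = "UTC")
)

# class() accepts one object per call, so loop over the columns instead.
col_classes <- lapply(all_trips[c("started_at", "ended_at")], class)
print(col_classes)

# One complete statement per line avoids the "unexpected symbol" parse error.
first_ride <- difftime(all_trips$ended_at[1], all_trips$started_at[1], units = "secs")
print(first_ride)
```

If both columns really are date-times, each entry of col_classes should read "POSIXct" "POSIXt".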
When all else fails, read the manual. I just realized that difftime is base R, not {lubridate}. That does not explain why your script is not working and mine appears to be, but it may be relevant. It occurs to me I have not actually checked the accuracy of what I am getting.
I have to go out for two or three hours. If all you need is ride duration, I wonder if this would do as a work-around for the moment.
Example
library(tidyverse) # provides %>%; attaches {lubridate} from tidyverse 2.0.0 onward
t1 <- ymd_hms("2024-04-25 10:00:00")
t2 <- ymd_hms("2024-04-25 14:15:00")
(t2 - t1)
Time difference of 4.25 hours
Therefore
(t2 - t1) %>% as.numeric
4.25
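One caveat with this work-around: subtracting two date-times lets R pick the units automatically (hours here, but it could be seconds, minutes or days for other gaps), so as.numeric() on the result is not always in the same unit. A small base R sketch that pins the units, reusing the same two example times (the variable names auto_units and secs are just for illustration):

```r
t1 <- as.POSIXct("2024-04-25 10:00:00", tz = "UTC")
t2 <- as.POSIXct("2024-04-25 14:15:00", tz = "UTC")

auto_units <- t2 - t1                      # units chosen automatically
units(auto_units)                          # "hours" for this pair

secs <- difftime(t2, t1, units = "secs")   # units fixed explicitly
as.numeric(secs)                           # 15300

# as.numeric() can also convert an existing difftime to a chosen unit.
as.numeric(auto_units, units = "secs")     # 15300
```

Pinning units = "secs" matters most when the result is stored in a column and aggregated later, since every row then shares one unit.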
Thanks so much for all your effort.
I actually want to add a "ride_length" calculation to all_trips (in seconds), hence the code:
all_trips$ride_length <- difftime(all_trips$ended_at, all_trips$started_at)
But if I write the same code with as.difftime instead:
all_trips$ride_length <- as.difftime(all_trips$ended_at, all_trips$started_at)
it runs and returns
tibble [791,956 × 15] (S3: tbl_df/tbl/data.frame)
ride_id : chr [1:791956] "21742443" "21742444" "21742445" "21742446" ...
started_at : chr [1:791956] "2019-01-01 0:04:37" "2019-01-01 0:08:13" "2019-01-01 0:13:23" "2019-01-01 0:13:45" ...
ended_at : chr [1:791956] "2019-01-01 0:11:07" "2019-01-01 0:15:34" "2019-01-01 0:27:12" "2019-01-01 0:43:28" ...
rideable_type : chr [1:791956] "2167" "4386" "1524" "252" ...
start_station_id : num [1:791956] 199 44 15 123 173 98 98 211 150 268 ...
start_station_name: chr [1:791956] "Wabash Ave & Grand Ave" "State St & Randolph St" "Racine Ave & 18th St" "California Ave & Milwaukee Ave" ...
end_station_id : num [1:791956] 84 624 644 176 35 49 49 142 148 141 ...
end_station_name : chr [1:791956] "Milwaukee Ave & Grand Ave" "Dearborn St & Van Buren St ()" "Western Ave & Fillmore St ()" "Clark St & Elm St" ...
member_casual : chr [1:791956] "member" "member" "member" "member" ...
date : Date[1:791956], format: "2019-01-01" ...
month : chr [1:791956] "01" "01" "01" "01" ...
day : chr [1:791956] "01" "01" "01" "01" ...
year : chr [1:791956] "2019" "2019" "2019" "2019" ...
day_of_week : chr [1:791956] "Tuesday" "Tuesday" "Tuesday" "Tuesday" ...
$ ride_length : 'difftime' num [1:791956] NA NA NA NA ...
..- attr(*, "units")= chr "secs"
[1] FALSE
[1] TRUE
When I do a mean calculation
mean(all_trips_v2$ride_length)
{NA}
So with that I wouldn't have anything to aggregate, which definitely means something is wrong somewhere.
as.difftime is a translator. It will not calculate anything, so "NA" is exactly what we would expect.
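To make the difference concrete, here is a hedged base R sketch using the first ride from the output above. It assumes, as the str() listing suggests, that started_at and ended_at are stored as character strings; the variable names are made up for illustration. as.difftime() treats its second argument as a format string rather than a start time, which would be consistent with the NA values, while difftime() on parsed date-times does the actual subtraction:

```r
started_chr <- "2019-01-01 0:04:37"
ended_chr   <- "2019-01-01 0:11:07"

# as.difftime(tim, format, units): the second argument is a strptime format,
# so the start time is never subtracted and the parse does not succeed.
translated <- as.difftime(ended_chr, started_chr)
is.na(translated)

# as.difftime() is meant for relabelling an already known span as a difftime.
as.difftime(390, units = "secs")

# Parse the strings first, then let difftime() compute the gap.
fmt <- "%Y-%m-%d %H:%M:%S"
started <- as.POSIXct(started_chr, format = fmt, tz = "UTC")
ended   <- as.POSIXct(ended_chr,   format = fmt, tz = "UTC")
ride_length <- difftime(ended, started, units = "secs")
as.numeric(ride_length)   # 390
```

The time zone "UTC" is an assumption here; whichever zone is used, it should be the same for both columns.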
I have not decided whether I am more surprised that your version of the script does not run or that my version does.
Anyway, let's try a completely different approach using pure {lubridate}.
Replace
all_trips$ride_length <- difftime(all_trips$ended_at,all_trips$started_at)
with
all_trips$ride_duration <- as.duration(all_trips$started_at %--% all_trips$ended_at) %>% as.numeric()
This will give you the duration of each ride in seconds. I just changed ride_length to ride_duration to keep it simple while I was experimenting. The name really does not matter.
Here is a bit of code that should (hopefully) give you some idea of what is happening.
library(tidyverse) # attaches {lubridate} from tidyverse 2.0.0 onward; otherwise also run library(lubridate)
t1 <- ymd_hms("2024-04-22 10:00:00")
t2 <- ymd_hms("2024-04-25 14:15:00")
time.interval <- t1 %--% t2 # example of calculating the time interval.
as.duration(t1 %--% t2) %>% as.numeric()
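Since the accuracy of the result had not been checked earlier in the thread, a quick sanity check is to compare the {lubridate} interval approach with base R difftime() on the same two times. A sketch, assuming {lubridate} is installed (secs_lubridate and secs_base are illustrative names):

```r
library(lubridate)

t1 <- ymd_hms("2024-04-22 10:00:00")
t2 <- ymd_hms("2024-04-25 14:15:00")

# Interval -> duration -> plain number of seconds.
secs_lubridate <- as.numeric(as.duration(t1 %--% t2))

# Same gap computed with base R, units pinned to seconds.
secs_base <- as.numeric(difftime(t2, t1, units = "secs"))

secs_lubridate                          # 274500 (3 days, 4 hours, 15 minutes)
all.equal(secs_lubridate, secs_base)    # both routes should agree
```

If the two numbers agree on a handful of rows from the real data, either approach should be safe to use for the ride duration column.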
Alright, thanks, I will try it now. I had actually given up on using R and moved over to a spreadsheet, but I will reinstall RStudio now and try this out. Thank you so much for your assistance.
No! No! Not a spreadsheet! Spreadsheets are evil!
More to the point spreadsheets are horribly error-prone.
Here is a very famous case: "The Reinhart-Rogoff error – or how not to Excel at economics".
Both authors are well-known Harvard economists.
This worked perfectly. Thank you so much.
Alright, I will check this out.
Thanks.