Hi all,
This is my first post in the R community, and it's really great to meet everyone here!
I'm currently stuck on a problem and could really use some help. I read in a CSV file in which two datetime columns (started_at and ended_at) were imported as character columns. I used lubridate's dmy_hm() to convert them to datetime, but when I check the column types afterwards, they are still character. Am I missing something?
I have uploaded the file, and my R code is pasted below for your reference.
library(tidyverse)
library(lubridate)
library(ggplot2)
library(magrittr)
Oct2021 <- read_csv("202110-divvy-tripdata.csv")
str(Oct2021)
#Output
spec_tbl_df [631,226 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ ride_id           : chr [1:631226] "620BC6107255BF4C" "4471C70731AB2E45" "26CA69D43D15EE14" "362947F0437E1514" ...
 $ rideable_type     : chr [1:631226] "electric_bike" "electric_bike" "electric_bike" "electric_bike" ...
 $ started_at        : chr [1:631226] "22-10-21 12:46" "21-10-21 9:12" "16-10-21 16:28" "16-10-21 16:17" ...
 $ ended_at          : chr [1:631226] "22-10-21 12:49" "21-10-21 9:14" "16-10-21 16:36" "16-10-21 16:19" ...
 $ start_station_name: chr [1:631226] "Kingsbury St & Kinzie St" NA NA NA ...
 $ start_station_id  : chr [1:631226] "KA1503000043" NA NA NA ...
 $ end_station_name  : chr [1:631226] NA NA NA NA ...
 $ end_station_id    : chr [1:631226] NA NA NA NA ...
 $ start_lat         : num [1:631226] 41.9 41.9 41.9 41.9 41.9 ...
 $ start_lng         : num [1:631226] -87.6 -87.7 -87.7 -87.7 -87.7 ...
 $ end_lat           : num [1:631226] 41.9 41.9 41.9 41.9 41.9 ...
 $ end_lng           : num [1:631226] -87.6 -87.7 -87.7 -87.7 -87.7 ...
 $ member_casual     : chr [1:631226] "member" "member" "member" "member" ...
 - attr(*, "spec")=
  .. cols(
  ..   ride_id = col_character(),
  ..   rideable_type = col_character(),
  ..   started_at = col_character(),
  ..   ended_at = col_character(),
  ..   start_station_name = col_character(),
  ..   start_station_id = col_character(),
  ..   end_station_name = col_character(),
  ..   end_station_id = col_character(),
  ..   start_lat = col_double(),
  ..   start_lng = col_double(),
  ..   end_lat = col_double(),
  ..   end_lng = col_double(),
  ..   member_casual = col_character()
  .. )
 - attr(*, "problems")=
Oct2021 %>%
  mutate(started_at = lubridate::dmy_hm(started_at),
         ended_at = lubridate::dmy_hm(ended_at))
#Output
# A tibble: 631,226 x 13
   ride_id    rideable_type started_at          ended_at            start_station_na~ start_station_id end_station_name
 1 620BC6107~ electric_bike 2021-10-22 12:46:00 2021-10-22 12:49:00 Kingsbury St & K~ KA1503000043     NA
 2 4471C7073~ electric_bike 2021-10-21 09:12:00 2021-10-21 09:14:00 NA                NA               NA
 3 26CA69D43~ electric_bike 2021-10-16 16:28:00 2021-10-16 16:36:00 NA                NA               NA
 4 362947F04~ electric_bike 2021-10-16 16:17:00 2021-10-16 16:19:00 NA                NA               NA
 5 BB731DE2F~ electric_bike 2021-10-20 23:17:00 2021-10-20 23:26:00 NA                NA               NA
 6 7176307BB~ electric_bike 2021-10-21 16:57:00 2021-10-21 17:11:00 NA                NA               NA
 7 E965A0415~ electric_bike 2021-10-21 17:46:00 2021-10-21 17:49:00 NA                NA               NA
 8 E41D986E8~ electric_bike 2021-10-20 23:30:00 2021-10-20 23:38:00 NA                NA               NA
 9 E189D96E3~ electric_bike 2021-10-21 18:17:00 2021-10-21 18:24:00 NA                NA               NA
10 17019B8A4~ electric_bike 2021-10-06 18:47:00 2021-10-06 18:56:00 NA                NA               NA
# ... with 631,216 more rows, and 6 more variables: end_station_id, start_lat, start_lng,
#   end_lat, end_lng, member_casual
Double-checking the variable types again:
str(Oct2021)
#Output
spec_tbl_df [631,226 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ ride_id           : chr [1:631226] "620BC6107255BF4C" "4471C70731AB2E45" "26CA69D43D15EE14" "362947F0437E1514" ...
 $ rideable_type     : chr [1:631226] "electric_bike" "electric_bike" "electric_bike" "electric_bike" ...
 $ started_at        : chr [1:631226] "22-10-21 12:46" "21-10-21 9:12" "16-10-21 16:28" "16-10-21 16:17" ...
 $ ended_at          : chr [1:631226] "22-10-21 12:49" "21-10-21 9:14" "16-10-21 16:36" "16-10-21 16:19" ...
 $ start_station_name: chr [1:631226] "Kingsbury St & Kinzie St" NA NA NA ...
 $ start_station_id  : chr [1:631226] "KA1503000043" NA NA NA ...
 $ end_station_name  : chr [1:631226] NA NA NA NA ...
 $ end_station_id    : chr [1:631226] NA NA NA NA ...
 $ start_lat         : num [1:631226] 41.9 41.9 41.9 41.9 41.9 ...
 $ start_lng         : num [1:631226] -87.6 -87.7 -87.7 -87.7 -87.7 ...
 $ end_lat           : num [1:631226] 41.9 41.9 41.9 41.9 41.9 ...
 $ end_lng           : num [1:631226] -87.6 -87.7 -87.7 -87.7 -87.7 ...
 $ member_casual     : chr [1:631226] "member" "member" "member" "member" ...
 - attr(*, "spec")=
  .. cols(
  ..   ride_id = col_character(),
  ..   rideable_type = col_character(),
  ..   started_at = col_character(),
  ..   ended_at = col_character(),
  ..   start_station_name = col_character(),
  ..   start_station_id = col_character(),
  ..   end_station_name = col_character(),
  ..   end_station_id = col_character(),
  ..   start_lat = col_double(),
  ..   start_lng = col_double(),
  ..   end_lat = col_double(),
  ..   end_lng = col_double(),
  ..   member_casual = col_character()
  .. )
 - attr(*, "problems")=
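In case it helps narrow things down, here is a minimal sketch of the same steps on a made-up two-row tibble. The names `toy` and `toy_converted` and their values are hypothetical stand-ins, not taken from the Divvy file. It seems to suggest that the conversion itself works, and that mutate() returns a new tibble rather than modifying the original, so the result may simply need to be assigned to a name. I'm not certain this is the whole story for my real data.

```r
library(dplyr)
library(lubridate)

# Hypothetical two-row stand-in for the real data (same day-month-year format)
toy <- tibble(
  started_at = c("22-10-21 12:46", "21-10-21 9:12"),
  ended_at   = c("22-10-21 12:49", "21-10-21 9:14")
)

# Without assignment: the converted tibble is only printed, toy is unchanged
toy %>%
  mutate(started_at = dmy_hm(started_at),
         ended_at   = dmy_hm(ended_at))
class(toy$started_at)            # "character"

# With assignment: the converted columns are kept in a new object
toy_converted <- toy %>%
  mutate(started_at = dmy_hm(started_at),
         ended_at   = dmy_hm(ended_at))
class(toy_converted$started_at)  # "POSIXct" "POSIXt"
```

If that holds for the full file too, assigning the result back (for example to `Oct2021` itself) would presumably make str() report the datetime type.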
Let me know if you need anything else from me or if any part is unclear; I'm happy to elaborate. Any help is appreciated!
Thanks!