I have a large data set of around 6 million rows. It was created by using bind_rows to combine several spreadsheets that all have the same columns.
I want to create new columns for date, month, day, year, and day of the week, so I ran the following code:
all_trips$date <- as.Date(all_trips$start_time)
all_trips$month <- format(as.Date(all_trips$date), "%m")
all_trips$day <- format(as.Date(all_trips$date), "%d")
all_trips$year <- format(as.Date(all_trips$date), "%Y")
all_trips$day_of_week <- format(as.Date(all_trips$date), "%A")
After the code finishes, I run str(all_trips) and get:
tibble [6,444,645 × 16] (S3: tbl_df/tbl/data.frame)
$ trip_id : chr [1:6444645] "7C00A93E10556E47" "90854840DFD508BA" "0A7D10CDD144061C" "2F3BE33085BCFF02" ...
$ bikeid : chr [1:6444645] "electric_bike" "electric_bike" "electric_bike" "electric_bike" ...
$ start_time : chr [1:6444645] "11/27/21 13:27" "11/27/21 13:38" "11/26/21 22:03" "11/27/21 9:56" ...
$ end_time : chr [1:6444645] "11/27/21 13:46" "11/27/21 13:56" "11/26/21 22:05" "11/27/21 10:01" ...
$ ride length : 'hms' num [1:6444645] 00:19:00 00:17:45 00:02:22 00:05:01 ...
..- attr(*, "units")= chr "secs"
$ day of the week : chr [1:6444645] "Saturday" "Saturday" "Friday" "Saturday" ...
$ from_station_name: chr [1:6444645] NA NA NA NA ...
$ from_station_id : chr [1:6444645] NA NA NA NA ...
$ to_station_name : chr [1:6444645] NA NA NA NA ...
$ to_station_id : chr [1:6444645] NA NA NA NA ...
$ usertype : chr [1:6444645] "casual" "casual" "casual" "casual" ...
$ date : Date[1:6444645], format: NA NA NA NA ...
$ month : chr [1:6444645] NA NA NA NA ...
$ day : chr [1:6444645] NA NA NA NA ...
$ year : chr [1:6444645] NA NA NA NA ...
$ day_of_week : chr [1:6444645] NA NA NA NA ...
However, I keep getting NA for every value in the new columns. I have also tried converting the start and end times to a numeric format using:
all_trips$started_at = as.numeric(as.character(all_trips$started_at))
all_trips$ended_at = as.numeric(as.character(all_trips$ended_at))
The result is again NA values, along with conversion errors. This is for a portfolio project, so any help would be greatly appreciated.
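
For reference, here is a minimal sketch of one direction that may be worth checking. It assumes every start_time value follows the month/day/two-digit-year hour:minute layout visible in the str() output above ("11/27/21 13:27"). The three-row sample, the `parsed` variable, and the "UTC" time zone are assumptions made for illustration, not part of the real data. The idea is to give the parser an explicit format string instead of letting as.Date guess:

```r
# Hypothetical three-row sample that mimics the start_time strings shown in str()
all_trips <- data.frame(
  start_time = c("11/27/21 13:27", "11/26/21 22:03", "11/27/21 9:56"),
  stringsAsFactors = FALSE
)

# Parse the text with an explicit format: month/day/two-digit year hour:minute.
# tz = "UTC" is an assumption; substitute the time zone the data was recorded in.
parsed <- as.POSIXct(all_trips$start_time, format = "%m/%d/%y %H:%M", tz = "UTC")

# Derive the new columns from the parsed date-times
all_trips$date        <- as.Date(parsed)
all_trips$month       <- format(parsed, "%m")
all_trips$day         <- format(parsed, "%d")
all_trips$year        <- format(parsed, "%Y")
all_trips$day_of_week <- format(parsed, "%A")  # weekday name depends on the locale

# Count rows that still failed to parse; a nonzero count would suggest that
# some of the source spreadsheets use a different date layout
sum(is.na(parsed))
```

If the spreadsheets do not all share one layout, the NA count at the end should show how many rows need a second format. The as.numeric attempt returns NA for a related reason: a string such as "11/27/21 13:27" is not a number, so it cannot be coerced directly.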