How to change col types/format

Sorry guys, I am new to programming.

My problem is that I cannot bind my csv files because one column, "ride_length", is read as col_character() in one file, while in all the other files it is read as col_time(format = "").

So I don't know how to change the type from character to time.

I need some advice, please help.

Context:

ride_length = col_character() needs to change to ride_length = col_time(format = "")

I have read this post, however I still do not understand which commands I should use.

Please post some examples of the data that gets imported as characters. Is it like
1:34:20
or something else?

A handy way to supply some sample data is the dput() function. In the case of a large dataset something like dput(head(mydata, 100)) should supply the data we need. Just do dput(mydata) where mydata is your data. Copy the output and paste it here.

Yes, my data were imported like 00:11:45; however, the other data files were also imported like 00:24:45 and are still parsed as col_time(format = ""). I don't know how to show you an example, I wish I could. Or should I give you the .csv file?

If I use dput(head(mydata, 100)), the output looks weird, so I just used dput(head(mydata, 10)).

Here is the output:

dput(head(jul22,10))
structure(list(ride_id = c("954144C2F67B1932", "292E027607D218B6",
"57765852588AD6E0", "B5B6BE44314590E6", "A4C331F2A00E79E0", "579D73BE2ED880B3",
"EFE518CCEE333669", "315FEBB7B3F6D2EA", "EE3C4A1E66766B56", "1EE6C93A547A187C"
), rideable_type = c("classic_bike", "classic_bike", "classic_bike",
"classic_bike", "classic_bike", "electric_bike", "classic_bike",
"classic_bike", "classic_bike", "electric_bike"), started_at = c("5/7/2022 8:12",
"26/7/2022 12:53", "3/7/2022 13:58", "31/7/2022 17:44", "13/7/2022 19:49",
"1/7/2022 17:04", "18/7/2022 18:11", "28/7/2022 20:38", "10/7/2022 22:55",
"10/7/2022 9:35"), ended_at = c("5/7/2022 8:24", "26/7/2022 12:55",
"3/7/2022 14:06", "31/7/2022 18:42", "13/7/2022 20:15", "1/7/2022 17:13",
"18/7/2022 18:22", "28/7/2022 21:09", "10/7/2022 23:01", "10/7/2022 9:47"
), start_station_name = c("ashland ave & blackhawk st", "buckingham fountain (temp)",
"buckingham fountain (temp)", "buckingham fountain (temp)", "wabash ave & grand ave",
"desplaines st & randolph st", "marquette ave & 89th st", "wabash ave & grand ave",
"wabash ave & grand ave", "ashland ave & blackhawk st"), end_station_name = c("kingsbury st & kinzie st",
"michigan ave & 8th st", "michigan ave & 8th st", "woodlawn ave & 55th st",
"sheffield ave & wellington ave", "clinton st & roosevelt rd",
"east end ave & 87th st", "dearborn pkwy & delaware pl", "dearborn pkwy & delaware pl",
"orleans st & merchandise mart plaza"), start_lat = c(41.907066,
41.86962075, 41.86962075, 41.86962075, 41.891466, 41.88461411,
41.73366879, 41.891466, 41.891466, 41.90709305), start_lng = c(-87.667252,
-87.62398124, -87.62398124, -87.62398124, -87.626761, -87.64456356,
-87.55834222, -87.626761, -87.626761, -87.6672473), end_lat = c(41.88917683,
41.872773, 41.872773, 41.795264, 41.93625348, 41.86711778, 41.73681521,
41.898969, 41.898969, 41.888243), end_lng = c(-87.63850577, -87.623981,
-87.623981, -87.596471, -87.6526621, -87.64108796, -87.58280128,
-87.629912, -87.629912, -87.63639), member_casual = c("member",
"casual", "casual", "casual", "member", "member", "member", "casual",
"member", "member"), ride_length = c("00:11:45", "00:01:53",
"00:07:43", "00:58:29", "00:26:18", "00:08:43", "00:11:29", "00:30:53",
"00:05:33", "00:11:27"), day_of_week = c(3, 3, 1, 1, 4, 6, 2,
5, 1, 1)), row.names = c(NA, -10L), class = c("tbl_df", "tbl",
"data.frame"))

Thanks for posting the data. It shows that the columns started_at, ended_at, and ride_length are all characters. I wrote the data to a csv file and read it back with read_csv() from the readr package. The columns started_at and ended_at were still characters, but ride_length was read in as an hms time value. I then used the dmy_hm() function from lubridate to turn started_at and ended_at into date-time (POSIXct) values. Is that what you need?

``` r
DF <- structure(list(
  ride_id = c("954144C2F67B1932", "292E027607D218B6", "57765852588AD6E0",
              "B5B6BE44314590E6", "A4C331F2A00E79E0", "579D73BE2ED880B3",
              "EFE518CCEE333669", "315FEBB7B3F6D2EA", "EE3C4A1E66766B56",
              "1EE6C93A547A187C"),
  rideable_type = c("classic_bike", "classic_bike", "classic_bike",
                    "classic_bike", "classic_bike", "electric_bike",
                    "classic_bike", "classic_bike", "classic_bike",
                    "electric_bike"),
  started_at = c("5/7/2022 8:12", "26/7/2022 12:53", "3/7/2022 13:58",
                 "31/7/2022 17:44", "13/7/2022 19:49", "1/7/2022 17:04",
                 "18/7/2022 18:11", "28/7/2022 20:38", "10/7/2022 22:55",
                 "10/7/2022 9:35"),
  ended_at = c("5/7/2022 8:24", "26/7/2022 12:55", "3/7/2022 14:06",
               "31/7/2022 18:42", "13/7/2022 20:15", "1/7/2022 17:13",
               "18/7/2022 18:22", "28/7/2022 21:09", "10/7/2022 23:01",
               "10/7/2022 9:47"),
  start_station_name = c("ashland ave & blackhawk st", "buckingham fountain (temp)",
                         "buckingham fountain (temp)", "buckingham fountain (temp)",
                         "wabash ave & grand ave", "desplaines st & randolph st",
                         "marquette ave & 89th st", "wabash ave & grand ave",
                         "wabash ave & grand ave", "ashland ave & blackhawk st"),
  end_station_name = c("kingsbury st & kinzie st", "michigan ave & 8th st",
                       "michigan ave & 8th st", "woodlawn ave & 55th st",
                       "sheffield ave & wellington ave", "clinton st & roosevelt rd",
                       "east end ave & 87th st", "dearborn pkwy & delaware pl",
                       "dearborn pkwy & delaware pl",
                       "orleans st & merchandise mart plaza"),
  start_lat = c(41.907066, 41.86962075, 41.86962075, 41.86962075, 41.891466,
                41.88461411, 41.73366879, 41.891466, 41.891466, 41.90709305),
  start_lng = c(-87.667252, -87.62398124, -87.62398124, -87.62398124, -87.626761,
                -87.64456356, -87.55834222, -87.626761, -87.626761, -87.6672473),
  end_lat = c(41.88917683, 41.872773, 41.872773, 41.795264, 41.93625348,
              41.86711778, 41.73681521, 41.898969, 41.898969, 41.888243),
  end_lng = c(-87.63850577, -87.623981, -87.623981, -87.596471, -87.6526621,
              -87.64108796, -87.58280128, -87.629912, -87.629912, -87.63639),
  member_casual = c("member", "casual", "casual", "casual", "member",
                    "member", "member", "casual", "member", "member"),
  ride_length = c("00:11:45", "00:01:53", "00:07:43", "00:58:29", "00:26:18",
                  "00:08:43", "00:11:29", "00:30:53", "00:05:33", "00:11:27"),
  day_of_week = c(3, 3, 1, 1, 4, 6, 2, 5, 1, 1)
), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))
str(DF)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    10 obs. of  13 variables:
#>  $ ride_id           : chr  "954144C2F67B1932" "292E027607D218B6" "57765852588AD6E0" "B5B6BE44314590E6" ...
#>  $ rideable_type     : chr  "classic_bike" "classic_bike" "classic_bike" "classic_bike" ...
#>  $ started_at        : chr  "5/7/2022 8:12" "26/7/2022 12:53" "3/7/2022 13:58" "31/7/2022 17:44" ...
#>  $ ended_at          : chr  "5/7/2022 8:24" "26/7/2022 12:55" "3/7/2022 14:06" "31/7/2022 18:42" ...
#>  $ start_station_name: chr  "ashland ave & blackhawk st" "buckingham fountain (temp)" "buckingham fountain (temp)" "buckingham fountain (temp)" ...
#>  $ end_station_name  : chr  "kingsbury st & kinzie st" "michigan ave & 8th st" "michigan ave & 8th st" "woodlawn ave & 55th st" ...
#>  $ start_lat         : num  41.9 41.9 41.9 41.9 41.9 ...
#>  $ start_lng         : num  -87.7 -87.6 -87.6 -87.6 -87.6 ...
#>  $ end_lat           : num  41.9 41.9 41.9 41.8 41.9 ...
#>  $ end_lng           : num  -87.6 -87.6 -87.6 -87.6 -87.7 ...
#>  $ member_casual     : chr  "member" "casual" "casual" "casual" ...
#>  $ ride_length       : chr  "00:11:45" "00:01:53" "00:07:43" "00:58:29" ...
#>  $ day_of_week       : num  3 3 1 1 4 6 2 5 1 1
write.csv(DF, "~/R/Play/Dummy.csv", row.names = FALSE)
DF_IN <- readr::read_csv("~/R/Play/Dummy.csv")
#> Rows: 10 Columns: 13
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (7): ride_id, rideable_type, started_at, ended_at, start_station_name, ...
#> dbl  (5): start_lat, start_lng, end_lat, end_lng, day_of_week
#> time (1): ride_length
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
str(DF_IN)
#> spc_tbl_ [10 × 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
#>  $ ride_id           : chr [1:10] "954144C2F67B1932" "292E027607D218B6" "57765852588AD6E0" "B5B6BE44314590E6" ...
#>  $ rideable_type     : chr [1:10] "classic_bike" "classic_bike" "classic_bike" "classic_bike" ...
#>  $ started_at        : chr [1:10] "5/7/2022 8:12" "26/7/2022 12:53" "3/7/2022 13:58" "31/7/2022 17:44" ...
#>  $ ended_at          : chr [1:10] "5/7/2022 8:24" "26/7/2022 12:55" "3/7/2022 14:06" "31/7/2022 18:42" ...
#>  $ start_station_name: chr [1:10] "ashland ave & blackhawk st" "buckingham fountain (temp)" "buckingham fountain (temp)" "buckingham fountain (temp)" ...
#>  $ end_station_name  : chr [1:10] "kingsbury st & kinzie st" "michigan ave & 8th st" "michigan ave & 8th st" "woodlawn ave & 55th st" ...
#>  $ start_lat         : num [1:10] 41.9 41.9 41.9 41.9 41.9 ...
#>  $ start_lng         : num [1:10] -87.7 -87.6 -87.6 -87.6 -87.6 ...
#>  $ end_lat           : num [1:10] 41.9 41.9 41.9 41.8 41.9 ...
#>  $ end_lng           : num [1:10] -87.6 -87.6 -87.6 -87.6 -87.7 ...
#>  $ member_casual     : chr [1:10] "member" "casual" "casual" "casual" ...
#>  $ ride_length       : 'hms' num [1:10] 00:11:45 00:01:53 00:07:43 00:58:29 ...
#>   ..- attr(*, "units")= chr "secs"
#>  $ day_of_week       : num [1:10] 3 3 1 1 4 6 2 5 1 1
#>  - attr(*, "spec")=
#>   .. cols(
#>   ..   ride_id = col_character(),
#>   ..   rideable_type = col_character(),
#>   ..   started_at = col_character(),
#>   ..   ended_at = col_character(),
#>   ..   start_station_name = col_character(),
#>   ..   end_station_name = col_character(),
#>   ..   start_lat = col_double(),
#>   ..   start_lng = col_double(),
#>   ..   end_lat = col_double(),
#>   ..   end_lng = col_double(),
#>   ..   member_casual = col_character(),
#>   ..   ride_length = col_time(format = ""),
#>   ..   day_of_week = col_double()
#>   .. )
#>  - attr(*, "problems")=<externalptr>
library(dplyr)

library(lubridate)

DF_IN <- DF_IN |> mutate(started_at = dmy_hm(started_at),
                         ended_at = dmy_hm(ended_at))
str(DF_IN)
#> tibble [10 × 13] (S3: tbl_df/tbl/data.frame)
#>  $ ride_id           : chr [1:10] "954144C2F67B1932" "292E027607D218B6" "57765852588AD6E0" "B5B6BE44314590E6" ...
#>  $ rideable_type     : chr [1:10] "classic_bike" "classic_bike" "classic_bike" "classic_bike" ...
#>  $ started_at        : POSIXct[1:10], format: "2022-07-05 08:12:00" "2022-07-26 12:53:00" ...
#>  $ ended_at          : POSIXct[1:10], format: "2022-07-05 08:24:00" "2022-07-26 12:55:00" ...
#>  $ start_station_name: chr [1:10] "ashland ave & blackhawk st" "buckingham fountain (temp)" "buckingham fountain (temp)" "buckingham fountain (temp)" ...
#>  $ end_station_name  : chr [1:10] "kingsbury st & kinzie st" "michigan ave & 8th st" "michigan ave & 8th st" "woodlawn ave & 55th st" ...
#>  $ start_lat         : num [1:10] 41.9 41.9 41.9 41.9 41.9 ...
#>  $ start_lng         : num [1:10] -87.7 -87.6 -87.6 -87.6 -87.6 ...
#>  $ end_lat           : num [1:10] 41.9 41.9 41.9 41.8 41.9 ...
#>  $ end_lng           : num [1:10] -87.6 -87.6 -87.6 -87.6 -87.7 ...
#>  $ member_casual     : chr [1:10] "member" "casual" "casual" "casual" ...
#>  $ ride_length       : 'hms' num [1:10] 00:11:45 00:01:53 00:07:43 00:58:29 ...
#>   ..- attr(*, "units")= chr "secs"
#>  $ day_of_week       : num [1:10] 3 3 1 1 4 6 2 5 1 1
```

Created on 2023-08-09 with reprex v2.0.2
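
If ride_length has already been imported as character, it can probably also be converted after the fact instead of re-reading the file. The following is only a minimal sketch: the two-row jul22 tibble is a made-up stand-in for the real July data, and it assumes every value looks like HH:MM:SS. readr::parse_time() with format = "" uses the same flexible time parser that col_time(format = "") refers to.

``` r
library(dplyr)
library(readr)

# Hypothetical stand-in for a data frame where ride_length came in as character
jul22 <- tibble(
  ride_id     = c("954144C2F67B1932", "292E027607D218B6"),
  ride_length = c("00:11:45", "00:01:53")
)

# Convert the character column to an hms time column in place.
# Values that cannot be parsed would become NA (with a warning).
jul22 <- jul22 |>
  mutate(ride_length = parse_time(ride_length, format = ""))

class(jul22$ride_length)
```

After this step the column should have the same class as in the other monthly files, so binding should no longer complain about mismatched types.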

Actually, what I need is for ride_length to be read as an hms time value. How does your readr read the csv file into an hms time value? I already set the same format in the csv file, however the result from readr is ride_length as character.

Here are the contents of the csv file I read in.

"ride_id","rideable_type","started_at","ended_at","start_station_name","end_station_name","start_lat","start_lng","end_lat","end_lng","member_casual","ride_length","day_of_week"
"954144C2F67B1932","classic_bike","5/7/2022 8:12","5/7/2022 8:24","ashland ave & blackhawk st","kingsbury st & kinzie st",41.907066,-87.667252,41.88917683,-87.63850577,"member","00:11:45",3
"292E027607D218B6","classic_bike","26/7/2022 12:53","26/7/2022 12:55","buckingham fountain (temp)","michigan ave & 8th st",41.86962075,-87.62398124,41.872773,-87.623981,"casual","00:01:53",3
"57765852588AD6E0","classic_bike","3/7/2022 13:58","3/7/2022 14:06","buckingham fountain (temp)","michigan ave & 8th st",41.86962075,-87.62398124,41.872773,-87.623981,"casual","00:07:43",1
"B5B6BE44314590E6","classic_bike","31/7/2022 17:44","31/7/2022 18:42","buckingham fountain (temp)","woodlawn ave & 55th st",41.86962075,-87.62398124,41.795264,-87.596471,"casual","00:58:29",1
"A4C331F2A00E79E0","classic_bike","13/7/2022 19:49","13/7/2022 20:15","wabash ave & grand ave","sheffield ave & wellington ave",41.891466,-87.626761,41.93625348,-87.6526621,"member","00:26:18",4
"579D73BE2ED880B3","electric_bike","1/7/2022 17:04","1/7/2022 17:13","desplaines st & randolph st","clinton st & roosevelt rd",41.88461411,-87.64456356,41.86711778,-87.64108796,"member","00:08:43",6
"EFE518CCEE333669","classic_bike","18/7/2022 18:11","18/7/2022 18:22","marquette ave & 89th st","east end ave & 87th st",41.73366879,-87.55834222,41.73681521,-87.58280128,"member","00:11:29",2
"315FEBB7B3F6D2EA","classic_bike","28/7/2022 20:38","28/7/2022 21:09","wabash ave & grand ave","dearborn pkwy & delaware pl",41.891466,-87.626761,41.898969,-87.629912,"casual","00:30:53",5
"EE3C4A1E66766B56","classic_bike","10/7/2022 22:55","10/7/2022 23:01","wabash ave & grand ave","dearborn pkwy & delaware pl",41.891466,-87.626761,41.898969,-87.629912,"member","00:05:33",1
"1EE6C93A547A187C","electric_bike","10/7/2022 9:35","10/7/2022 9:47","ashland ave & blackhawk st","orleans st & merchandise mart plaza",41.90709305,-87.6672473,41.888243,-87.63639,"member","00:11:27",1

Make a csv from that and read it in using read_csv with just the file name as I did in my code.

DF_IN <- readr::read_csv("~/R/Play/Dummy.csv")

What is the class of the ride_length column?
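
A common reason for readr to fall back to character is that at least one value in the column does not look like a time, so it may help to look for such values. This is a rough sketch, not a diagnosis of the actual July file: the ride_length vector and its "not recorded" entry are invented for illustration.

``` r
library(readr)

# Hypothetical ride_length values; the third one is not a valid time
ride_length <- c("00:11:45", "00:01:53", "not recorded", "00:07:43")

class(ride_length)   # "character", as in the problem file

# parse_time() returns NA (with a warning) for values it cannot read as a time
parsed <- suppressWarnings(parse_time(ride_length, format = ""))

# Values that were present in the data but failed to parse
bad_values <- ride_length[is.na(parsed) & !is.na(ride_length)]
bad_values
```

Running the same check on the real column (for example jul22$ride_length) should list the entries that stop the column from being guessed as a time.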

The ride_length class should be time with hms format. I have a full year of ride data, but the class difference only occurs in this file (July) and not in the other months, so I have to fix the class to bind it with the other months.
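
One possible way to get the same class for every month is to state the column type when reading instead of letting readr guess it. The sketch below is only an illustration under assumptions: the two in-memory "files" and their contents are invented, and passing literal data with I() assumes readr 2.0 or later. With real data the file paths would go where jun_csv and jul_csv are.

``` r
library(readr)
library(dplyr)

# Two small in-memory "files" stand in for monthly csv exports (hypothetical data)
jun_csv <- I("ride_id,ride_length\nA1,00:24:45\nA2,00:03:10\n")
jul_csv <- I("ride_id,ride_length\nB1,00:11:45\nB2,not recorded\n")

# Force ride_length to be parsed as a time; other columns are still guessed
spec <- cols(ride_length = col_time(format = ""))

jun22 <- read_csv(jun_csv, col_types = spec)
jul22 <- suppressWarnings(read_csv(jul_csv, col_types = spec))

# Rows that could not be parsed as a time are reported here and stored as NA
problems(jul22)

# Both columns now share the hms class, so the months can be bound
all_rides <- bind_rows(jun22, jul22)
str(all_rides)
```

The design choice here is to accept NA for unparseable entries and inspect them with problems(), rather than leaving the whole column as character.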
