I want to split a date time column/variable (character datatypes) into two columns one for the date and one for the time

Is there anyone can help me in separating one column having a values of date time (e.g. 01/08/2020 12:00:00 AM ) into DATE column and TIME column? I tried to use the mutate function of tidyverse to do it. But, it gives me error. Here is the code I made and the error below:

df <- mutate(unclean_df,

  •          date_reported = as.Date(unclean_df$date_rptd, format = "%m-%d-%Y"),
    
  •          date_occurred = as.Date(unclean_df$date_occ, format = "%m-%d-%Y"),
    
  •          time_reported = as.POSIXct(unclean_df$date_rptd),
    
  •          time_occurred = as.POSIXct(unclean_df$date_occ))
    

Error in mutate():
:information_source: In argument: time_reported = as.POSIXct(unclean_df$date_rptd).
Caused by error in as.POSIXlt.character():
! character string is not in a standard unambiguous format
Run rlang::last_trace() to see where the error occurred.

A string representation that is converted to a POSIXct datetime object then back to a date component and a string component is going to end up as another string reperesentationβ€”converting to a datetime object is unnecessary.

(input <- gsub(".[A|P]M$","","11/6/23 15:30:00 PM"))
#> [1] "11/6/23 15:30:00"
(input <- strsplit(input," ") |> unlist())
#> [1] "11/6/23"  "15:30:00"
(Date <- input[1])
#> [1] "11/6/23"
(Time <- input[2])
#> [1] "15:30:00"
(d <- data.frame(Date = Date, Time = Time))
#>      Date     Time
#> 1 11/6/23 15:30:00

# optional
make_iso <- function(x) lubridate::mdy(x)
(d$Date <- make_iso(d$Date))
#> [1] "2023-11-06"

Created on 2023-11-06 with reprex v2.0.2

I appreciate your help with this.

By the way, does it also work on dataframe (data sets)?

Sure. Anything you can do to a character string you can to to a character vector in a dataframe. If you need help, open a new thread with a reprex (see the FAQ.

I have a question for you. I have this data sets. So, when I read_csv("file") of the particular file. it gives me this error below:

── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (17): DR_NO, Date Rptd, DATE OCC, TIME OCC, AREA, AREA NAME, Rpt Dist No, Crm Cd Desc, Mocodes, Vict Sex, Vict ...
dbl (9): Part 1-2, Crm Cd, Vict Age, Premis Cd, Weapon Used Cd, Crm Cd 1, Crm Cd 2, LAT, LON
lgl (2): Crm Cd 3, Crm Cd 4

:information_source: Use spec() to retrieve the full column specification for this data.
:information_source: Specify the column types or set show_col_types = FALSE to quiet this message.
Warning message:
One or more parsing issues, call problems() on your data frame for details, e.g.:
dat <- vroom(...)
problems(dat)

Now, I run the glimpse and class functions of a particular column in the dataframe and it stated that it is a "character/chr" datatypes. Such as the output below.

  • $ date_rptd "01/08/2020 12:00:00 AM", "01/02/2020 12:00:00 AM", "04/14/2020 12:00:00 AM", "01/01/2020 12:00:00

  • date_rptd

  •  <chr>                 
    
  • 01/08/2020 12:00:00 AM

  • 01/02/2020 12:00:00 AM

  • 04/14/2020 12:00:00 AM

  • 01/01/2020 12:00:00 AM

  • 01/01/2020 12:00:00 AM

  • 01/02/2020 12:00:00 AM

  • 01/02/2020 12:00:00 AM

  • 01/04/2020 12:00:00 AM

  • 01/04/2020 12:00:00 AM

  • 06/19/2020 12:00:00 AM

  • :information_source: 829,768 more rows

  • :information_source: Use print(n = ...) to see more rows

Now, I tried to change the datatypes of the column "date_rptd" into ymd_hms like this: df$date_rptd <- ymd_hms(df$date_rptd). It gives me an error such as this below:

Warning message:
All formats failed to parse. No formats found.

And, it turned the df$date_rptd values into all NAs. It the same goes to, df$date_rptd <- mdy(df$date_rptd) in which it turned the values into NAs.

However, when I tried once more with df$date_rptd <- mdy_hms(df$date_rptd). It does not gave me an error. Instead, the time is gone on the dataframe/tibble such output below:

  • date_rptd
  •  <date>                 
    
  • 01/08/2020
  • 01/02/2020
  • 04/14/2020
  • 01/01/2020
  • 01/01/2020
  • 01/02/2020
  • 01/02/2020
  • 01/04/2020
  • 01/04/2020
  • 06/19/2020
  • :information_source: 829,768 more rows

  • :information_source: Use print(n = ...) to see more rows

Why is that? Is it the data set I have has some issues itself in which R does not read it properly and unable to convert it to date/datetime?

Date handling depends primarily on whether you want to use something representing a date as a date object, a datetime object or a formatted character string for use in some output, such as a table in a report.

So, starting with a character string

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
input <- "11/09/2023 18:08:23"
a_date <- mdy_hms(input) 
# it's really a number--it's just displayed as 
# a date with time and the default TZ
typeof(a_date)
#> [1] "double"
str(a_date)
#>  POSIXct[1:1], format: "2023-11-09 18:08:23"
as.numeric(a_date) # seconds into the Unix Epoch
#> [1] 1699553303
# you can add to it; this increments by an hour
a_date + 60*60
#> [1] "2023-11-09 19:08:23 UTC"
# you can keep its class and output it formatted
# various ways
format(a_date, format = "%Y-%m-%d %H:%M:%S")
#> [1] "2023-11-09 18:08:23"
format(a_date, format = "%B %d, %Y")
#> [1] "November 09, 2023"
# convert it in various ways by assigning any of the above
date_string <- format(a_date, format = "%B %d, %Y")
date_string |> str()
#>  chr "November 09, 2023"

Created on 2023-11-09 with [reprex v2.0.2]

2 Likes

I appreciated your effort in explaining it to me. As a newbie to this field, you are very approachable, kind and you are very patient. I'll do my best to understand this R for data analysis. By the way, what is your opinion about the data analyst job 5 years from now? Do you think AI will take over the data analyst job?

Glad to help. I see AI being a big part of the job of a data analyst to boost productivity by offloading much of the effort required to keep syntax straight, including things as simple as balancing nested braces to getting quick answers to the kind of questions you see here to mini-tutorials to help understand machine learning tools you haven’t used before. For being able effectively take advantage the analyst should have a problem solving mindset that emphasizes what over how. Prompt engineering for AI isn’t immune from Garbage In/Garbage Out. Most of all the analyst must be the skeptic who always is applying reality checks.

1 Like

Thanks! I appreciated it. :+1:

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.