I have a question about a data set. When I read the file with read_csv("file"), I get the message below:
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (17): DR_NO, Date Rptd, DATE OCC, TIME OCC, AREA, AREA NAME, Rpt Dist No, Crm Cd Desc, Mocodes, Vict Sex, Vict ...
dbl (9): Part 1-2, Crm Cd, Vict Age, Premis Cd, Weapon Used Cd, Crm Cd 1, Crm Cd 2, LAT, LON
lgl (2): Crm Cd 3, Crm Cd 4
Use spec() to retrieve the full column specification for this data.
Specify the column types or set show_col_types = FALSE to quiet this message.
Warning message:
One or more parsing issues, call problems() on your data frame for details, e.g.:
dat <- vroom(...)
problems(dat)
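To show what problems() reports, here is a minimal sketch on a made-up file. The temporary CSV, the column names id and code, and the explicit col_types are all assumptions for illustration; they are not taken from the real data set.

```r
library(readr)

# Hypothetical stand-in for the real file (which is not shown here):
# the last row holds text in a column that is declared as integer.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,code", "1,10", "2,20", "3,abc"), tmp)

# Declaring the column types up front skips type guessing,
# so the value that does not fit is recorded as a parsing problem.
dat <- read_csv(tmp, col_types = cols(id = col_integer(), code = col_integer()))

# One row per value that could not be parsed
probs <- problems(dat)
print(probs)
```

A value that fails to parse is stored as NA in the data frame, so the rows listed by problems() show which cells were affected.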
Next, I ran glimpse() and class() on one column of the data frame, and both report it as a character (chr) column, as in the output below:
$ date_rptd "01/08/2020 12:00:00 AM", "01/02/2020 12:00:00 AM", "04/14/2020 12:00:00 AM", "01/01/2020 12:00:00

date_rptd
<chr>
01/08/2020 12:00:00 AM
01/02/2020 12:00:00 AM
04/14/2020 12:00:00 AM
01/01/2020 12:00:00 AM
01/01/2020 12:00:00 AM
01/02/2020 12:00:00 AM
01/02/2020 12:00:00 AM
01/04/2020 12:00:00 AM
01/04/2020 12:00:00 AM
06/19/2020 12:00:00 AM
829,768 more rows
Use print(n = ...) to see more rows
Then I tried to convert the column date_rptd to a date-time with ymd_hms, like this: df$date_rptd <- ymd_hms(df$date_rptd). It gave me the warning below:
Warning message:
All formats failed to parse. No formats found.
It also turned every value of df$date_rptd into NA. The same happened with df$date_rptd <- mdy(df$date_rptd): all values became NA.
However, when I tried again with df$date_rptd <- mdy_hms(df$date_rptd), I did not get an error. Instead, the time is missing from the data frame/tibble, as in the output below:
date_rptd
<date>
01/08/2020
01/02/2020
04/14/2020
01/01/2020
01/01/2020
01/02/2020
01/02/2020
01/04/2020
01/04/2020
06/19/2020
829,768 more rows
Use print(n = ...) to see more rows
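For reference, the mdy_hms attempt can be reproduced on a few of the sample values. This is only a sketch: x is a hypothetical stand-in for df$date_rptd, built from three of the strings shown earlier, and it assumes lubridate is installed.

```r
library(lubridate)

# Three of the sample values shown above; `x` is a hypothetical
# stand-in for df$date_rptd, not the real column.
x <- c("01/08/2020 12:00:00 AM",
       "01/02/2020 12:00:00 AM",
       "04/14/2020 12:00:00 AM")

# The strings are month/day/year followed by a time of day,
# which is the layout that mdy_hms() expects.
parsed <- mdy_hms(x)

# Check what was actually stored, independent of how it is printed
class(parsed)                          # class of the parsed vector
format(parsed, "%H:%M:%S")             # stored time of day for each value
format(parsed, "%Y-%m-%d %H:%M:%S")    # force the time to be displayed
```

Every sample value has the time 12:00:00 AM, which is midnight, so it may be that the time is still stored and only dropped from the default printout. Checking the column with format() as above would show whether that is the case for the full data.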
Why is that? Is there a problem with the data set itself, so that R does not read it properly and cannot convert the column to a date or date-time?