Changing format from <chr> to <date> changes year?!

I am quite new to R. I have a column where the dates are stored as <chr>, and I would like to convert it to <date>.

I have tried the following chunk:

activity_level$ActivityDay <- as.Date(activity_level$ActivityDay, "%m/%d/%Y")

But it changes the date year from 2016 to 2020. This occurs with all the data frames, and I can't figure out why. I have not been able to find an answer that seems to work. Any suggestions? Thanks!

Can you post a minimal example (including any loading of libraries, creation of a small data set, and your conversion code)?

I'll do my best, but I am painfully new to this!

Installing and loading packages

library(tidyverse)
library(dplyr)
library(ggplot2)
library(janitor)
library(lubridate)

Data set used/created:

activity_level <- read_csv("dailyIntensities_merged.csv") # Daily minutes per activity level

I have tried the following chunk (with lubridate loaded, although the call itself is base R's as.Date):

activity_level$ActivityDay <- as.Date(activity_level$ActivityDay, "%m/%d/%Y")

But it changes the date year from 2016 to 2020.

example:

4/12/2016 becomes 2020-04-12 instead of 2016-04-12

My 'Y' is capitalized. The day and month remain correct.
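
For anyone comparing notes: one known way to get exactly this symptom is a lowercase `%y` in the format string. This is only a guess at the cause here, since the code posted above uses `%Y`, but the base R sketch below shows the difference. The variable names and values are made up for illustration.

```r
# Illustrative values only; `raw_dates` is a hypothetical stand-in for the column.
raw_dates <- c("4/12/2016", "4/13/2016")

# %Y reads a four-digit year.
four_digit <- as.Date(raw_dates, format = "%m/%d/%Y")

# %y reads only two digits ("20"), which R maps to 2020;
# the trailing "16" is ignored.
two_digit <- as.Date(raw_dates, format = "%m/%d/%y")

format(four_digit, "%Y")  # "2016" "2016"
format(two_digit, "%Y")   # "2020" "2020"
```

If the years come out as 2020, it may be worth checking every format string in the script, including any in chunks that ran earlier.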

I even tried:

activity_level$ActivityHour=as.POSIXlt(activity_level$ActivityHour, format="%m/%d/%Y")

but the same thing happened. All the dates in the data should be 2016, but all convert to 2020.

I have uploaded the two screenshots of my tibble before and after.

Thank you so much for your response and please let me know if you need more information!

Second tibble as I could only upload one at a time.

Please run this code and post the output.

activity_level <- read_csv("dailyIntensities_merged.csv")
dput(head(activity_level))

Place the output between lines of three back ticks, like this
```
code output goes here
```

Hi @across, if your data set looks like this, you could do this:

# Install and load the lubridate package if not already installed
# install.packages("lubridate")
library(lubridate)

# Your original data frame
activity_level <- data.frame(
  ActivityDay = c("01/15/2016", "02/20/2011", "03/25/2014"),
  ActivityLevel = c(10, 15, 8)
)

# Convert the ActivityDay column to a date using lubridate
activity_level$ActivityDay <- lubridate::mdy(activity_level$ActivityDay)

# Print the updated data frame
activity_level

Your second tibble has different dimensions than the first one (6 columns rather than 10), so something more than the date conversion must have gone on. I was unable to reproduce the error (using a manually created data frame, since we don't have the CSV file to test on).

I get this:

Rows: 940 Columns: 10
── Column specification ─────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityDay
dbl (9): Id, SedentaryMinutes, LightlyActiveMinutes, FairlyActiveMinutes, VeryActiveMinutes, SedentaryActiveD...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

structure(list(Id = c(1503960366, 1503960366, 1503960366, 1503960366, 
1503960366, 1503960366), ActivityDay = c("4/12/2016", "4/13/2016", 
"4/14/2016", "4/15/2016", "4/16/2016", "4/17/2016"), SedentaryMinutes = c(728, 
776, 1218, 726, 773, 539), LightlyActiveMinutes = c(328, 217, 
181, 209, 221, 164), FairlyActiveMinutes = c(13, 19, 11, 34, 
10, 20), VeryActiveMinutes = c(25, 21, 30, 29, 36, 38), SedentaryActiveDistance = c(0, 
0, 0, 0, 0, 0), LightActiveDistance = c(6.05999994277954, 4.71000003814697, 
3.91000008583069, 2.82999992370605, 5.03999996185303, 2.50999999046326
), ModeratelyActiveDistance = c(0.550000011920929, 0.689999997615814, 
0.400000005960464, 1.25999999046326, 0.409999996423721, 0.779999971389771
), VeryActiveDistance = c(1.87999999523163, 1.57000005245209, 
2.44000005722046, 2.14000010490417, 2.71000003814697, 3.19000005722046
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))

and this if I remove the error
A tibble: 6 × 10

        Id ActivityDay SedentaryMinutes LightlyActiveMinutes FairlyActiveMinutes VeryActiveMinutes SedentaryActiveDistance
1503960366 4/12/2016                728                  328                  13                25                       0
1503960366 4/13/2016                776                  217                  19                21                       0
1503960366 4/14/2016               1218                  181                  11                30                       0
1503960366 4/15/2016                726                  209                  34                29                       0
1503960366 4/16/2016                773                  221                  10                36                       0
1503960366 4/17/2016                539                  164                  20                38                       0

6 rows | 1-7 of 10 columns

Try running exactly the lines of code below. Do you get the same result I do?

activity_level <- structure(list(Id = c(1503960366, 1503960366, 1503960366, 1503960366, 
                                        1503960366, 1503960366), 
                                 ActivityDay = c("4/12/2016", "4/13/2016", "4/14/2016", "4/15/2016", "4/16/2016", "4/17/2016"), 
                                 SedentaryMinutes = c(728, 776, 1218, 726, 773, 539), 
                                 LightlyActiveMinutes = c(328, 217, 181, 209, 221, 164), 
                                 FairlyActiveMinutes = c(13, 19, 11, 34, 10, 20), 
                                 VeryActiveMinutes = c(25, 21, 30, 29, 36, 38), 
                                 SedentaryActiveDistance = c(0, 0, 0, 0, 0, 0), 
                                 LightActiveDistance = c(6.05999994277954, 4.71000003814697, 3.91000008583069, 2.82999992370605, 5.03999996185303, 2.50999999046326), 
                                 ModeratelyActiveDistance = c(0.550000011920929, 0.689999997615814, 0.400000005960464, 1.25999999046326, 0.409999996423721, 0.779999971389771), 
                                 VeryActiveDistance = c(1.87999999523163, 1.57000005245209, 2.44000005722046, 2.14000010490417, 2.71000003814697, 3.19000005722046)), 
                            row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
activity_level
#>           Id ActivityDay SedentaryMinutes LightlyActiveMinutes
#> 1 1503960366   4/12/2016              728                  328
#> 2 1503960366   4/13/2016              776                  217
#> 3 1503960366   4/14/2016             1218                  181
#> 4 1503960366   4/15/2016              726                  209
#> 5 1503960366   4/16/2016              773                  221
#> 6 1503960366   4/17/2016              539                  164
#>   FairlyActiveMinutes VeryActiveMinutes SedentaryActiveDistance
#> 1                  13                25                       0
#> 2                  19                21                       0
#> 3                  11                30                       0
#> 4                  34                29                       0
#> 5                  10                36                       0
#> 6                  20                38                       0
#>   LightActiveDistance ModeratelyActiveDistance VeryActiveDistance
#> 1                6.06                     0.55               1.88
#> 2                4.71                     0.69               1.57
#> 3                3.91                     0.40               2.44
#> 4                2.83                     1.26               2.14
#> 5                5.04                     0.41               2.71
#> 6                2.51                     0.78               3.19
activity_level$ActivityDay <- as.Date(activity_level$ActivityDay, format = "%m/%d/%Y")
activity_level
#>           Id ActivityDay SedentaryMinutes LightlyActiveMinutes
#> 1 1503960366  2016-04-12              728                  328
#> 2 1503960366  2016-04-13              776                  217
#> 3 1503960366  2016-04-14             1218                  181
#> 4 1503960366  2016-04-15              726                  209
#> 5 1503960366  2016-04-16              773                  221
#> 6 1503960366  2016-04-17              539                  164
#>   FairlyActiveMinutes VeryActiveMinutes SedentaryActiveDistance
#> 1                  13                25                       0
#> 2                  19                21                       0
#> 3                  11                30                       0
#> 4                  34                29                       0
#> 5                  10                36                       0
#> 6                  20                38                       0
#>   LightActiveDistance ModeratelyActiveDistance VeryActiveDistance
#> 1                6.06                     0.55               1.88
#> 2                4.71                     0.69               1.57
#> 3                3.91                     0.40               2.44
#> 4                2.83                     1.26               2.14
#> 5                5.04                     0.41               2.71
#> 6                2.51                     0.78               3.19

Created on 2024-01-24 with reprex v2.0.2


Thanks for your reply!

I get this: Error: attempt to use zero-length variable name

Which line produces the error?
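
A possible explanation, offered as a guess since the thread never confirms it: this message typically appears when the Markdown fence lines (the three back ticks) are run as R code along with the chunk they surround. A minimal sketch that triggers the same message on purpose:

```r
# Assumption: a fence line (back ticks) was sent to R together with the code.
# An empty pair of back ticks is read as a variable name of length zero.
msg <- tryCatch(
  {
    eval(parse(text = "``"))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

If that is what happened, copying only the lines between the fences should avoid the error.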

Solved it with this piece of code!

activity_level <-
  activity_level |>
  mutate(
    ActivityDay = readr::parse_date(
      ActivityDay,
      format = "%m/%d/%Y"
    )
  )
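
Whichever parser is used, a quick check of the result can catch this kind of problem early. A minimal base R sketch, assuming every date in the file should fall in 2016 (as stated earlier in the thread); `converted` is a hypothetical stand-in for the parsed column:

```r
# Hypothetical stand-in for the parsed ActivityDay column.
converted <- as.Date(
  c("4/12/2016", "4/13/2016", "4/14/2016"),
  format = "%m/%d/%Y"
)

# Values that fail to parse become NA, so count them explicitly.
n_failed <- sum(is.na(converted))

# Every year should be 2016 for this data set.
years_seen <- unique(format(converted, "%Y"))

# Stop with an error if either expectation is violated.
stopifnot(n_failed == 0, identical(years_seen, "2016"))

# Earliest and latest date, for a visual check.
range(converted)
```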

Solved! But thank you so much for your help!
