Error with date format

Hi everybody,
I would like to convert the dates in the data below to the format yyyy-mm-dd.

df <- data.frame(
  rf = c(7, 7.2, 7.6, 7.35, 7.7, 7.9, 7.6, 7.6, 7.35, 8.45),
  date = as.factor(c("31-07-07", "31-08-07",
                     "30-09-07", "31-10-07", "30-11-07", "31-12-07",
                     "31-01-08", "29-02-08", "31-03-08", "30-04-08"))
)

I tried several approaches, but none of them worked.

In this attempt, the result is NA:

df<-read.csv("rf1.csv")
df$date<-as.character(df$date)
df$date<-as.Date(df$date,format='%d-%b-%y')

In this attempt, the year comes out as 0031:

df<-read.csv("rf1.csv")
df$date<-as.Date(df$date)
df$date<-as.Date(df$date,format='%d-%b-%y')
class(df$date)

In this attempt, the result is 2031-07-12, which has the wrong year:

library(lubridate)
df<-read.csv("rf1.csv")
ymd(df$date)

I would appreciate any help.
Thank you

If the original data are in the format day-month-year, you should use the dmy() function from lubridate.

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date
df<-data.frame(
  rf = c(7, 7.2, 7.6, 7.35, 7.7, 7.9, 7.6, 7.6, 7.35, 8.45),
  date = as.factor(c("31-07-07","31-08-07",
                     "30-09-07","31-10-07","30-11-07","31-12-07",
                     "31-01-08","29-02-08","31-03-08","30-04-08"))
)
df$date<-as.character(df$date)
df$date <- dmy(df$date)
df
#>      rf       date
#> 1  7.00 2007-07-31
#> 2  7.20 2007-08-31
#> 3  7.60 2007-09-30
#> 4  7.35 2007-10-31
#> 5  7.70 2007-11-30
#> 6  7.90 2007-12-31
#> 7  7.60 2008-01-31
#> 8  7.60 2008-02-29
#> 9  7.35 2008-03-31
#> 10 8.45 2008-04-30

Created on 2020-01-23 by the reprex package (v0.3.0)
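
For readers who prefer to stay in base R, here is a minimal sketch of the same conversion with as.Date(). It assumes, as above, that the strings are day-month-year with a numeric month; the name dat is only illustrative. The first attempt in the question most likely returned NA because %b expects an abbreviated month name (such as "Jul"), whereas %m matches a numeric month.

```r
# Same sample data as in the question; the dates are day-month-year strings.
dat <- data.frame(
  rf = c(7, 7.2, 7.6, 7.35, 7.7, 7.9, 7.6, 7.6, 7.35, 8.45),
  date = as.factor(c("31-07-07", "31-08-07", "30-09-07", "31-10-07", "30-11-07",
                     "31-12-07", "31-01-08", "29-02-08", "31-03-08", "30-04-08"))
)

# %d = day of month, %m = numeric month, %y = two-digit year.
# Convert the factor to character first, then parse.
dat$date <- as.Date(as.character(dat$date), format = "%d-%m-%y")

class(dat$date)
#> [1] "Date"
format(dat$date[1:3])
#> [1] "2007-07-31" "2007-08-31" "2007-09-30"
```

On input, base R maps two-digit years 00 to 68 to 20xx and 69 to 99 to 19xx, which is why "07" becomes 2007 here.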

Thanks for including the data, as described in the FAQ: What's a reproducible example (`reprex`) and how do I do one?

A preliminary issue: date() is a function in base R and df() is a function in the stats package, both normally on the search path, so it is best to avoid using either as a variable name. dat is a good substitute for df.

Second, lubridate takes its best shot at parsing the strings, but a two-digit year carries no century information, so the century has to be guessed.

So, to begin, let's do the reprex with the wrong answer.

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(lubridate)) 
dat <-data.frame(
          rf = c(7, 7.2, 7.6, 7.35, 7.7, 7.9, 7.6, 7.6, 7.35, 8.45),
        date = as.factor(c("31-07-07","31-08-07",
                           "30-09-07","31-10-07","30-11-07","31-12-07",
                           "31-01-08","29-02-08","31-03-08","30-04-08"))
)

colnames(dat) <- c("rf", "Date")
# if factors are really needed can be converted back
dat <- dat %>% mutate(Date = as.character(Date))
dat <- dat %>% mutate(Date = ymd(Date))
dat
#>      rf       Date
#> 1  7.00 2031-07-07
#> 2  7.20 2031-08-07
#> 3  7.60 2030-09-07
#> 4  7.35 2031-10-07
#> 5  7.70 2030-11-07
#> 6  7.90 2031-12-07
#> 7  7.60 2031-01-08
#> 8  7.60 2029-02-08
#> 9  7.35 2031-03-08
#> 10 8.45 2030-04-08
dat <- dat %>% mutate(Date = Date - years(100))
dat
#>      rf       Date
#> 1  7.00 1931-07-07
#> 2  7.20 1931-08-07
#> 3  7.60 1930-09-07
#> 4  7.35 1931-10-07
#> 5  7.70 1930-11-07
#> 6  7.90 1931-12-07
#> 7  7.60 1931-01-08
#> 8  7.60 1929-02-08
#> 9  7.35 1931-03-08
#> 10 8.45 1930-04-08

Created on 2020-01-23 by the reprex package (v0.3.0)
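
For comparison, here is a minimal sketch of how the two lubridate parsers read the same string. It assumes, as the question states, that the strings are day-month-year; the names x, as_ymd and as_dmy are only illustrative.

```r
suppressPackageStartupMessages(library(lubridate))

x <- "31-07-07"  # first date string from the question

as_ymd <- ymd(x)  # reads the fields as year-month-day
as_dmy <- dmy(x)  # reads the fields as day-month-year

as_ymd
#> [1] "2031-07-07"
as_dmy
#> [1] "2007-07-31"
```

With ymd(), the leading 31 is taken as the year and the trailing 07 as the day, so shifting the result by a century does not recover 2007-07-31. If the data really are day-month-year, dmy() looks like the better fit.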

@FJCC @technocrat
Thank you so much for your quick reply.
Best regards,

Great. Please mark a solution (see FAQ: How do I mark a solution?) for the benefit of those who follow.
