Hi, I'm new to R and I'm having trouble with dates. I've uploaded a csv file, which contains a column of dates in the format month-year (e.g. Jan-16). This is recognised by R as a factor.
I'd like to change this format to mm/yyyy (e.g. 01/2016) and have R recognise this as a date variable. Here's what I've tried:
library("lubridate")
Data$Month <- parse_date_time(Data$Month, "%b-%y")
# This converts the dates to yyyy-mm-dd, with all day values defaulting to 01 as I do not have values for them.
To get R to recognise this as a date variable I did the following:
Data$Month<-as.Date(Data$Month, "%Y-%m-%d")
I do not have values for the days and do not want these to appear, so to remove them and display the dates how I want, mm/yyyy, I did the following:
This successfully formats my dates as mm/yyyy, but changes the variable type to character instead of date. Is there a way to set the variable type back to date, whilst keeping this format?
I think you're largely on the right track. What you're looking for, though, is the sort of thing you can do in Excel or Google Sheets, where the underlying data (in this case, a date that just uses "1" for the day of the month) is separate from the formatting (display) of the cell. Right?
In R, there really isn't that concept of "the data" vs. "the display" in a data frame. What I typically do when I need something like this is to have two columns: one that holds the date as a date, and one that holds a string (or factor) that is the formatted display version. So, you could have Month as a date and Month_Display as a string or factor. As long as they stay in sync, you can then use the former for any date-based operations and the latter for any display operations.
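A minimal sketch of the two-column idea, using base R only. The sample data frame is made up to stand in for your CSV, and Month_Display is just the name suggested above. It also assumes an English locale, since %b matches abbreviated month names in the current locale.

```r
# Hypothetical sample data standing in for the uploaded CSV;
# Month is a factor, as in the question.
Data <- data.frame(Month = factor(c("Jan-16", "Feb-16", "Mar-16")))

# Prepend a day of month so the string is a complete date, then parse it.
# paste0() converts the factor to its labels first.
Data$Month <- as.Date(paste0("01-", Data$Month), format = "%d-%b-%y")

# Derive the display column from the Date column.
# format() always returns character, so this column is for display only.
Data$Month_Display <- format(Data$Month, "%m/%Y")

str(Data)
```

Because Month_Display is derived from Month, re-running that last assignment after any change to Month is what keeps the two in sync.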
Alternatively, you can keep the single date column (with the 1st of the month in it), but then do a quick format() wherever you're going to display the data in output or wherever you'd like it to be shown as %m/%Y.
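Here's roughly what the single-column approach could look like. Again, the data and the Value column are invented for illustration; the point is that filtering happens on the real Date column and format() is applied only to the copy being shown.

```r
# Hypothetical data: Month is already a Date (1st of each month).
Data <- data.frame(
  Month = as.Date(c("2016-01-01", "2016-02-01", "2016-03-01")),
  Value = c(10, 12, 9)  # made-up values
)

# Date-based operations use the Date column directly.
Recent <- Data[Data$Month >= as.Date("2016-02-01"), ]

# Format only at display time, on a copy; Data$Month itself stays a Date.
Shown <- transform(Recent, Month = format(Month, "%m/%Y"))
print(Shown)
```

The trade-off is that you have to remember the format() call at each point of output, but there is no second column to keep up to date.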
That's not the sort of answer you're hoping for, I'm afraid. But, hopefully, it helps nonetheless.