I'm doing an assignment for class, and we are using a dataset in xlsx format. The dataset consists of an observation_date variable, and one other variable. When I uploaded the dataset into R, the observation_date appears as numbers, rather than 10/1/2010 for example. How do I change the data in the observation_date variable to make it appear as an actual date instead of 44033 for example?
str(myData)
tibble [253 × 2] (S3: tbl_df/tbl/data.frame)
observation_date: num [1:253] 44033 44034 44035 44036 44039 ...
SP500 : num [1:253] 3257 3276 3236 3216 3239 ...
myData$Date <- as.Date(myData$Date, format = "%m/%d/%Y")
Error: Assigned dataas.Date(myData$Date, format = "%m/%d/%Y")
must be compatible with existing data.
Existing data has 253 rows.
Assigned data has 0 rows.
Only vectors of size 1 are recycled.
Runrlang::last_error()
to see where the error occurred.
In addition: Warning message:
Unknown or uninitialised column:Date
.
myData$observation_date <- as.observation_date(myData$observation_date, format = "%m/%d/%Y")
Error in as.observation_date(myData$observation_date, format = "%m/%d/%Y") :
could not find function "as.observation_date"
myData$observation_date <- as.Date(myData$observation_date, format = "%m/%d/%Y")
Error in as.Date.numeric(myData$observation_date, format = "%m/%d/%Y") :
'origin' must be supplied
This was the code I tried to attempt to change it
I'm not sure specifically what you mean by "appear as an actual date", but
as.Date(44348, origin="1900-01-01")
[1] "2021-06-03"
Does that help?
I'm trying to change all of the dates in the dataset from appearing like "44348" to actually appearing like dates "1990-01-01".
You might try:
as.Date(myData$observation_date, origin="1900-01-01")
of course.
head(myData)
A tibble: 6 x 2
observation_date SP500
1 44033 3257.
2 44034 3276.
3 44035 3236.
4 44036 3216.
5 44039 3239.
6 44040 3218.
try this;
myData %>%
mutate(Datex = map_chr(observation_date , as.character))
I don't think that changed the format of the observation date. It still appears the same when I view(myData) or use the head(myData) commands.
if you could share your sample data in a link such as googledrive will help in getting the result or a reproducible example. As you read the file, the column seems to be numeric.
So,
try
library(lubridate)
myData[ , 1 ] <- ymd(myData[, 1])
its in the correct format before I imported the file into R. But once I displayed the file in R after I imported it, the date changes to the numeric value that I don't want
library(lubridate)
myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.
This is what happened when I tried your suggestion
Excel represents dates as the number of days since an origin date, try this
library(dplyr)
myData <- myData %>%
mutate(Date = as.Date(Date, origin="1899-12-30"))
library(dplyr)
myData <- myData %>%
- mutate(Date = as.Date(Date, origin="1899-12-30"))
Error: Problem withmutate()
columnDate
.
Date = as.Date(Date, origin = "1899-12-30")
.
do not know how to convert 'Date' to class “Date”
Runrlang::last_error()
to see where the error occurred.
not sure how to share the data from googledrive link on here
To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:
here is the link:
here is the code:
library(readxl)
myData <- read_excel("Downloads/SP500Correct.xlsx")
Error:path
does not exist: ‘Downloads/SP500Correct.xlsx’
myData <- read_excel("Downloads/SP500Correct.xlsx")
Error:path
does not exist: ‘Downloads/SP500Correct.xlsx’
library(readxl)
myData <- read_excel("Downloads/SP500Correct.xlsx")
Error:path
does not exist: ‘Downloads/SP500Correct.xlsx’
myData <- read_excel("Downloads/SP500Correct.xlsx")
view(myData)
nrow(myData)
[1] 253
head(myData)
A tibble: 6 x 2
observation_date SP500
1 44033 3257.
2 44034 3276.
3 44035 3236.
4 44036 3216.
5 44039 3239.
6 44040 3218.
str(myData)
tibble [253 × 2] (S3: tbl_df/tbl/data.frame)
observation_date: num [1:253] 44033 44034 44035 44036 44039 ... SP500 : num [1:253] 3257 3276 3236 3216 3239 ...
myData$Date <- as.Date(myData$Date, format = "%m/%d/%Y")
Error: Assigned dataas.Date(myData$Date, format = "%m/%d/%Y")
must be compatible with existing data.
Existing data has 253 rows.
Assigned data has 0 rows.
Only vectors of size 1 are recycled.
Runrlang::last_error()
to see where the error occurred.
In addition: Warning message:
Unknown or uninitialised column:Date
.
myData$observation_date <- as.observation_date(myData$observation_date, format = "%m/%d/%Y")
Error in as.observation_date(myData$observation_date, format = "%m/%d/%Y") :
could not find function "as.observation_date"
myData$observation_date <- as.Date(myData$observation_date, format = "%m/%d/%Y")
Error in as.Date.numeric(myData$observation_date, format = "%m/%d/%Y") :
'origin' must be suppliedview(myData)
myData %>%
- mutate(Datex = map_chr(observation_date , as.character))
A tibble: 253 x 3
observation_date SP500 Datex
1 44033 3257. 44033
2 44034 3276. 44034
3 44035 3236. 44035
4 44036 3216. 44036
5 44039 3239. 44039
6 44040 3218. 44040
7 44041 3258. 44041
8 44042 3246. 44042
9 44043 3271. 44043
10 44046 3295. 44046
… with 243 more rows
view(myData)
head(myData)
A tibble: 6 x 2
observation_date SP500
1 44033 3257.
2 44034 3276.
3 44035 3236.
4 44036 3216.
5 44039 3239.
6 44040 3218.
library(lubridate)
Attaching package: ‘lubridate’
The following objects are masked from ‘package:base’:
date, intersect, setdiff, union
myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.
library(lubridate)
myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.
library(lubridate)
myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.
library(dplyr)myData <- myData %>%
-
mutate(Date = as.Date(Date, origin="1899-12-30"))
Error: Problem with mutate()
column Date
.
Date = as.Date(Date, origin = "1899-12-30")
.
do not know how to convert 'Date' to class “Date”
Run rlang::last_error()
to see where the error occurred.
library(dplyr)
myData <- myData %>%
- mutate(Date = as.Date(Date, origin="1899-12-30"))
Error: Problem withmutate()
columnDate
.
Date = as.Date(Date, origin = "1899-12-30")
.
do not know how to convert 'Date' to class “Date”
Runrlang::last_error()
to see where the error occurred.
Did you try:
myData$observation_date <- as.Date(myData$observation_date, origin="1900-01-01")
That should change the original column in your myData
data frame.
This solution worked for me. Thank you so much
There are lots of great solutions here already but just to also mention the {janitor} package which works really well with tidyverse packages and has a function convert_to_date() that specifically handles the Excel serial numbers to dates problem. It's a great package that has some other cleaning functions and is really easy to use.