hi, i have a dataframe with the column birth date, this column is as numerical, for example 1041998, i need to transform it to 01-04-1998 day-month-year or similar. i tried to use as.data but the result was just NA's
Hi! Did you mean "as.Date" rather than "as.data"?
yes, sorry i used as.date
I prefer to use tidyverse and lubridate to solve this problem.
# packages
library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.1.2
#> Warning: package 'readr' was built under R version 4.1.2
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
# the data frame
my_data <- tribble(
~birth_date,
01011999,
01021998,
02011997,
03041996
)
# convert from number to date
my_converted_data <- my_data %>%
mutate(birth_date_as_date = mdy(birth_date)) # use the lubridate::mdy function
my_converted_data
#> # A tibble: 4 x 2
#> birth_date birth_date_as_date
#> <dbl> <date>
#> 1 1011999 1999-01-01
#> 2 1021998 1998-01-02
#> 3 2011997 1997-02-01
#> 4 3041996 1996-03-04
Created on 2021-12-29 by the reprex package (v2.0.1)
Great solution! If you choose to use as.Date, the original variable needs to be character rather than numeric.
And I assume the variable you're using has 8 characters rather than the 7 you've provided as an example, so 10041998 rather than 1041998. Otherwise, you'll need to pad single digits with zeros, which may mean breaking the date variable into subparts (i.e., month, day, year).
i tried your solution but recieved a warning message
ptje_2020m <- ptje_2020 %>%
mutate(FECHA_NACIMIENTO_as_date = mdy(FECHA_NACIMIENTO))
Warning message:
Problem withmutate()
columnFECHA_NACIMIENTO_as_date
.
FECHA_NACIMIENTO_as_date = mdy(FECHA_NACIMIENTO)
.
186282 failed to parse.
i have dates with 8 or 7 characters, such as 1041998 that would be 01/04/1998 in d,m,y format
Is your date in mdy format? Or is it in dmy format? If in dmy then try the dmy()
function from lubridate. In my response I am assuming mdy.
thank you, it works perfectly
i'm sorry to bother again, but i need to calculate the age, i tried to use
ptje_2020 <- ptje_2020 %>%
mutate(AGE = age_calc(FECHA_NACIMIENTO_as_date,units = "years"))
but it gave the following error
Error: Problem with mutate()
column AGE
.
AGE = age_calc(FECHA_NACIMIENTO_as_date, units = "years")
.
x missing value where TRUE/FALSE needed
Backtrace:
-
%>%
(...) - base::.handleSimpleError(...)
- dplyr:::h(simpleError(msg, call))
I'm not familiar with age_calc
, but this is how I would calculate an age, or the exact length of a time span. This is the same as what I posted above, but this shows how to calculate the age.
# packages
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 4.1.2
#> Warning: package 'tibble' was built under R version 4.1.2
#> Warning: package 'readr' was built under R version 4.1.2
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
# the data frame
my_data <- tribble(
~birth_date,
01011999,
01021998,
02011997,
03041996
)
# convert from number to date
my_converted_data <- my_data %>%
mutate(birth_date_as_date = dmy(birth_date), # use the lubridate::dmy function
today_date = today()) %>% # create variable for today
mutate(age = time_length(x = difftime(today_date, # calculate interval between
birth_date_as_date),
unit = "years"))
# view
my_converted_data
#> # A tibble: 4 x 4
#> birth_date birth_date_as_date today_date age
#> <dbl> <date> <date> <dbl>
#> 1 1011999 1999-01-01 2021-12-30 23.0
#> 2 1021998 1998-02-01 2021-12-30 23.9
#> 3 2011997 1997-01-02 2021-12-30 25.0
#> 4 3041996 1996-04-03 2021-12-30 25.7
Created on 2021-12-30 by the reprex package (v2.0.1)
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
If you have a query related to it or one of the replies, start a new topic and refer back with a link.