I'm trying to create a new variable in a dataset based on conditions on other variables. Basically, I want to simplify the information about the parents' education, which is split between father and mother, into a single new variable that takes the highest education level of the two parents. For example, if the father's education level is 1 and the mother's is 0, the value for this row in the new variable would be 1.
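To make the intended rule concrete, here is a minimal base R sketch of "take the higher of the two levels". The toy data frame is invented for illustration and is not my real dataset, and it assumes the factor levels are the strings '0' and '1', which may not hold for the real 'labelled' columns:

```r
# Toy data, invented for illustration; the real columns are 'labelled' factors.
toy <- data.frame(
  MOTHER_EDUCATIONAL_VAR = factor(c("0", "0", "1", NA, NA)),
  FATHER_EDUCATIONAL_VAR = factor(c("0", "1", "0", "1", NA))
)

# Convert factor levels to numbers via character; as.integer() on a factor
# alone would return the internal level codes (1, 2, ...), not 0/1.
mother <- as.integer(as.character(toy$MOTHER_EDUCATIONAL_VAR))
father <- as.integer(as.character(toy$FATHER_EDUCATIONAL_VAR))

# Row-wise maximum; na.rm = TRUE keeps the known value when one parent is NA,
# and the result stays NA when both parents are NA.
toy$NEW_EDUCATIONAL_VAR <- pmax(mother, father, na.rm = TRUE)
print(toy)
```

In this sketch the row with mother '0' and father '1' gets 1, which is the behaviour I want from the real variable.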
I'm using mutate() with case_when(), which worked for another variable, but I don't understand why it isn't working now. When I run it, it creates a column containing only NAs, and when I print a table of it, the result is:
< table of extent 0 >
Both variables that I'm using in the conditions have the classes 'labelled' and 'factor'.
First, I tried the following command (I'm simplifying the code):
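Since the class might matter here, this is a small sketch of how a factor's levels can be inspected and how == behaves against them. The toy factor and its labels ("None", "Primary") are hypothetical, and I'm only assuming that my real columns might be stored in a similar way:

```r
# Toy factor, invented for illustration: codes 0/1 stored with text labels.
x <- factor(c(0, 1, 0, NA), levels = c(0, 1), labels = c("None", "Primary"))

class(x)                   # storage class of the column
levels(x)                  # the strings that == is compared against
table(x, useNA = "ifany")  # counts per level, including NA

# Comparing a factor with a string that is not one of its levels never matches.
matches_code  <- x == "0"
matches_label <- x == "None"
```

If the real levels turn out to be text labels rather than '0' and '1', every condition in case_when() would be FALSE or NA, which would be consistent with a column of only NAs.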
dataset <- dataset %>%
  mutate(NEW_EDUCATIONAL_VAR = case_when(
    MOTHER_EDUCATIONAL_VAR == '0' & FATHER_EDUCATIONAL_VAR == '0' ~ '0',
    MOTHER_EDUCATIONAL_VAR == '0' & FATHER_EDUCATIONAL_VAR == '1' ~ '1'
  ))
Then I tried to account for the rows with NA values, since some rows have NA in these variables:
dataset <- dataset %>%
  mutate(NEW_EDUCATIONAL_VAR = case_when(
    is.na(MOTHER_EDUCATIONAL_VAR) & is.na(FATHER_EDUCATIONAL_VAR) ~ '99',
    MOTHER_EDUCATIONAL_VAR == '0' & FATHER_EDUCATIONAL_VAR == '1' ~ '1'
  ))
When I used the same functions to create a categorical variable from the age of each case, it worked:
dataset <- dataset %>%
  mutate(AGE_CAT = case_when(
    AGE >= 16 & AGE <= 18 ~ '0',
    AGE >= 19 & AGE <= 24 ~ '1',
    AGE >= 25 & AGE <= 29 ~ '2',
    AGE >= 30 ~ '3'
  ))
So, what am I doing wrong? Thanks a lot.