I have a factor variable called disease severity with 6 levels: asymptomatic, healthy control, mild, moderate, severe, critical severe.
I want to form a new factor variable with four levels: asymptomatic, mild and moderate combined, severe, and critical severe, ignoring healthy control.
I can do this with an if / else if chain, but is there an easier, more efficient way to do it?
You can just set the factor levels you aren't interested in to NA, which drops them from the factor.
See the example below as a guide.
set.seed(123)
x <- factor(sample(letters[1:6], 20, TRUE))
x
#> [1] c f c b b f c e d f f a b c e c c a d a
#> Levels: a b c d e f
levels(x)[c(3, 5)] <- NA
x
#> [1] <NA> f <NA> b b f <NA> <NA> d f f a b <NA> <NA>
#> [16] <NA> <NA> a d a
#> Levels: a b d f
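Applied to a severity factor, the same idea can pick the level by name rather than by position, which avoids depending on the (alphabetical) level order. This is only a sketch: `severity` is made-up illustration data, and the level spellings are assumed.

```r
# Hypothetical data; level names assumed for illustration
severity <- factor(c("asymptomatic", "healthy control", "mild",
                     "moderate", "severe", "critical severe", "mild"))

# Setting a level to NA removes it; its observations become NA
levels(severity)[levels(severity) == "healthy control"] <- NA

levels(severity)
#> [1] "asymptomatic"    "critical severe" "mild"            "moderate"
#> [5] "severe"
```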
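To combine levels instead of dropping them, assign an existing level's name to the levels you want to merge into it:
set.seed(123)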
x <- factor(sample(letters[1:6], 20, TRUE))
x
#> [1] c f c b b f c e d f f a b c e c c a d a
#> Levels: a b c d e f
levels(x)[c(3, 5)] <- levels(x)[c(2, 4)]
x
#> [1] b f b b b f b d d f f a b b d b b a d a
#> Levels: a b d f
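For a case like the one in the question, dropping and merging can probably be done in one step, since base R's `levels<-` also accepts a named list: each name is a new level, each element lists the old levels mapped to it, and old levels left out of the list become NA. A minimal sketch with made-up data (`severity`, `severity4` and the new level name `mild_moderate` are illustrative, not from the question):

```r
# Hypothetical data; level names assumed for illustration
severity <- factor(c("asymptomatic", "healthy control", "mild",
                     "moderate", "severe", "critical severe"))

severity4 <- severity  # work on a copy so the original factor is kept
levels(severity4) <- list(
  asymptomatic      = "asymptomatic",
  mild_moderate     = c("mild", "moderate"),  # two old levels, one new level
  severe            = "severe",
  `critical severe` = "critical severe"
)  # "healthy control" is not listed, so it becomes NA

levels(severity4)
#> [1] "asymptomatic"    "mild_moderate"   "severe"          "critical severe"
```

The new levels follow the order of the list, so this also lets you put them in a meaningful order rather than alphabetical.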
Alternatively, fct_collapse() from the forcats package merges levels under a new name:
set.seed(123)
x <- factor(sample(letters[1:6], 20, TRUE))
x
forcats::fct_collapse(x, c_d = c("c", "d"))
#> [1] c_d f c_d b b f c_d e c_d f f a b c_d e c_d c_d a c_d a
#> Levels: a b c_d e f
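With forcats, a level can also be dropped by naming it NULL in fct_recode(), so the two steps might be chained as sketched below. The data and the names `severity`, `merged`, `severity4` and `mild_moderate` are made up for illustration, and the code assumes forcats is installed (it is skipped otherwise).

```r
# Runs only if the forcats package is available
if (requireNamespace("forcats", quietly = TRUE)) {
  # Hypothetical data; level names assumed for illustration
  severity <- factor(c("asymptomatic", "healthy control", "mild",
                       "moderate", "severe", "critical severe"))

  # Step 1: merge "mild" and "moderate" into a single level
  merged <- forcats::fct_collapse(severity,
                                  mild_moderate = c("mild", "moderate"))

  # Step 2: naming a level NULL removes it; its observations become NA
  severity4 <- forcats::fct_recode(merged, NULL = "healthy control")

  print(levels(severity4))
}
```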