How can I create a composite (binary) variable from several existing variables?

Hi,
Suppose I want to create a new variable heart_disease with two levels (0 and 1) from three existing binary variables (heart_failure, angina, MI). The variable heart_disease will be coded as 1 if any of the existing variables has a value of 1; otherwise heart_disease = 0. How do I code this? Please see the sample data below.
Thank you so much!
Javaid

heart_failure angina MI heart_disease
1 0 0 1
0 0 0 0
1 1 0 1
1 0 1 1
1 0 0 1
0 1 1 1
0 0 0 0
0 1 0 1

Here are two methods.

library(dplyr)
DF <- read.csv("~/R/Play/Dummy.csv")
DF2 <- DF %>% mutate(Heart_Disease=as.numeric(heart_failure+angina+MI > 0))
DF2                        
  heart_failure angina MI Heart_Disease
1             1      0  0             1
2             0      0  0             0
3             1      1  0             1
4             1      0  1             1
5             1      0  0             1
6             0      1  1             1
7             0      0  0             0
8             0      1  0             1
 
#Assume the initial data does not include Heart_Disease
DF3 <- DF  %>% 
   mutate(Heart_Disease=as.numeric(rowSums(.)>0))
DF3
  heart_failure angina MI Heart_Disease
1             1      0  0             1
2             0      0  0             0
3             1      1  0             1
4             1      0  1             1
5             1      0  0             1
6             0      1  1             1
7             0      0  0             0
8             0      1  0             1
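Note that the rowSums(.) version sums every column in the data frame, so it only works when the three component variables are the only columns. As a minimal base R sketch (the patient_id column and the components vector are hypothetical, added just for illustration), naming the columns explicitly avoids that assumption and makes the line safe to re-run after heart_disease already exists:

```r
# Same example data, plus a hypothetical ID column that must not be summed
DF <- data.frame(
  patient_id    = 101:108,
  heart_failure = c(1, 0, 1, 1, 1, 0, 0, 0),
  angina        = c(0, 0, 1, 0, 0, 1, 0, 1),
  MI            = c(0, 0, 0, 1, 0, 1, 0, 0)
)

# Only these columns feed the composite variable
components <- c("heart_failure", "angina", "MI")

# 1 if any component is 1, otherwise 0
DF$heart_disease <- as.numeric(rowSums(DF[components]) > 0)

# Running the same line again gives the same answer, because
# heart_disease itself is not listed in components
DF$heart_disease <- as.numeric(rowSums(DF[components]) > 0)
DF
```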
DF <- data.frame(
  heart_failure = c(1, 0, 1, 1, 1, 0, 0, 0),
  angina = c(0, 0, 1, 0, 0, 1, 0, 1),
  MI = c(0, 0, 0, 1, 0, 1, 0, 0)
)


DF["heart_disease"] <- ifelse(rowSums(DF, na.rm = TRUE) > 0,1,0)
DF
#>   heart_failure angina MI heart_disease
#> 1             1      0  0             1
#> 2             0      0  0             0
#> 3             1      1  0             1
#> 4             1      0  1             1
#> 5             1      0  0             1
#> 6             0      1  1             1
#> 7             0      0  0             0
#> 8             0      1  0             1
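One thing to be aware of with na.rm = TRUE: a row where no variable is 1 but some values are missing is coded 0, even though the true status is unknown. Whether that is acceptable depends on your data. A rough sketch of one alternative, assuming you would rather keep such rows as NA (the example values and the helper names any_yes and any_missing are made up for illustration):

```r
# Small made-up example with missing values
DF <- data.frame(
  heart_failure = c(1, 0, NA, NA),
  angina        = c(0, 0, 1, NA),
  MI            = c(0, 0, NA, 0)
)
components <- c("heart_failure", "angina", "MI")

# TRUE if at least one observed value is 1
any_yes <- rowSums(DF[components] == 1, na.rm = TRUE) > 0
# TRUE if at least one value is missing
any_missing <- rowSums(is.na(DF[components])) > 0

# 1 when any component is 1; NA when no 1 was seen but something is missing;
# 0 only when all three are observed and all are 0
DF$heart_disease <- ifelse(any_yes, 1, ifelse(any_missing, NA_real_, 0))
DF$heart_disease   # values: 1, 0, 1, NA
```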

Thank you, @FJCC! Much appreciated!

Thank you, @technocrat!

Hi @FJCC,
Thank you so much for the help! What if my existing variables are stored as factor variables, each with two levels (0 = "No", 1 = "Yes")? I am sorry, I should have been clearer.
Thank you!

Please post a small data set as you did above but use the output of the dput() function. If your small data set is called DF, run dput(DF) and paste the output between lines that contain only three back ticks.
```
Paste output here
```

Sorry for that, please see below:

structure(list(heart_failure = c("Yes", "No", "Yes", "Yes", "No", 
"No", "No"), angina = c("No", "No", "Yes", "No", "No", "Yes", 
"No"), MI = c("No", "No", "No", "Yes", "No", "Yes", "No"), heart_disease = c("Yes", 
"No", "Yes", "Yes", "No", "Yes", "No")), class = "data.frame", row.names = c(NA, 
-7L))

I would use a variation of @technocrat's method.

DF <- structure(list(heart_failure = c("Yes", "No", "Yes", "Yes", "No", "No", "No"), 
                     angina = c("No", "No", "Yes", "No", "No", "Yes","No"), 
                     MI = c("No", "No", "No", "Yes", "No", "Yes", "No"), 
                     heart_disease = c("Yes", "No", "Yes", "Yes", "No", "Yes", "No")), 
                class = "data.frame", row.names = c(NA, -7L))
DF <- DF[, -4] # drop the existing heart_disease column
DF["heart_disease"] <- ifelse(rowSums(DF == "Yes", na.rm = TRUE) > 0,"Yes", "No")
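The dput() output above holds character columns rather than factors, but the comparison with "Yes" should behave the same way for factors. Here is a small sketch, assuming the columns really are factors with levels "No" and "Yes" and that you want the new variable to be a factor too (the components and any_yes names are only for illustration). Comparing against the label also sidesteps as.numeric(), which on a factor returns the level codes 1 and 2 rather than 0 and 1:

```r
DF <- data.frame(
  heart_failure = c("Yes", "No", "Yes", "Yes", "No", "No", "No"),
  angina        = c("No", "No", "Yes", "No", "No", "Yes", "No"),
  MI            = c("No", "No", "No", "Yes", "No", "Yes", "No")
)
components <- c("heart_failure", "angina", "MI")

# Convert the three columns to factors, as described in the question
DF[components] <- lapply(DF[components], factor, levels = c("No", "Yes"))

# Compare on the label, not on the underlying integer code
any_yes <- rowSums(DF[components] == "Yes", na.rm = TRUE) > 0

# Store the result as a factor with the same two levels
DF$heart_disease <- factor(ifelse(any_yes, "Yes", "No"), levels = c("No", "Yes"))
DF
```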
