Using a for loop to insert string values into the Cox regression function for multivariate analysis

I am working on a data mining project and have run into some trouble. For multivariate analysis, I have been using this function:
res.cox <- coxph(Surv(OS, Status) ~ Sex + RANDOMGENE, mydata)
summary(res.cox)

The variables are the column headers from my Excel sheet. In place of RANDOMGENE, I put the specific gene mutation whose association with survival I want to analyze. This call works perfectly. However, since I have many genes to analyze, I would like to automate the process. I tried a for loop but always get the error "variable lengths differ". Below is the example code I still need to fix. Basically, I need the for loop to "feed" each gene into the coxph function, but I cannot figure out why it does not work.

Here is some dummy data:

mydata <- read.table(header=TRUE, text="ATRX    IDH1   VEGF    Sex    survival    vital.status
2   1   1   2   1.419178082 2
2   1   1   1   5   1
2   1   1   2   1.082191781 2
1   1   1   1   0.038356164 1
2   1   2   2   0.77260274  2
1   1   2   2   2.336986301 1
2   1   2   1   1.271232877 1")

Here is the code that I still need to work on:
genes <- names(mydata[1:3])
for (i in 1:length(genes))
{
print(coxph(Surv(survival, vital.status) ~ Sex + genes[i], mydata))
}

This is the error that I get:
Error in model.frame.default(formula = Surv(survival, vital.status) ~ :
variable lengths differ (found for 'genes[i]')

How about this?

library(survival)

mydata <- read.table(header=TRUE, text="ATRX    IDH1   VEGF    Sex    survival    vital.status
2   1   1   2   1.419178082 2
2   1   1   1   5   1
2   1   1   2   1.082191781 2
1   1   1   1   0.038356164 1
2   1   2   2   0.77260274  2
1   1   2   2   2.336986301 1
2   1   2   1   1.271232877 1")

genes <- names(mydata[1:3])

purrr::map(genes, ~coxph(as.formula(paste("Surv(survival, vital.status) ~  Sex + ", .x)), mydata))
#> [[1]]
#> Call:
#> coxph(formula = as.formula(paste("Surv(survival, vital.status) ~  Sex + ", 
#>     .x)), data = mydata)
#> 
#>           coef exp(coef)  se(coef)     z     p
#> Sex  2.166e+01 2.544e+09 3.089e+04 0.001 0.999
#> ATRX 2.172e+01 2.701e+09 3.838e+04 0.001 1.000
#> 
#> Likelihood ratio test=5.42  on 2 df, p=0.06667
#> n= 7, number of events= 3 
#> 
#> [[2]]
#> Call:
#> coxph(formula = as.formula(paste("Surv(survival, vital.status) ~  Sex + ", 
#>     .x)), data = mydata)
#> 
#>           coef exp(coef)  se(coef)     z     p
#> Sex  2.074e+01 1.015e+09 2.467e+04 0.001 0.999
#> IDH1        NA        NA 0.000e+00    NA    NA
#> 
#> Likelihood ratio test=2.64  on 1 df, p=0.104
#> n= 7, number of events= 3 
#> 
#> [[3]]
#> Call:
#> coxph(formula = as.formula(paste("Surv(survival, vital.status) ~  Sex + ", 
#>     .x)), data = mydata)
#> 
#>            coef  exp(coef)   se(coef)      z     p
#> Sex   2.078e+01  1.064e+09  2.475e+04  0.001 0.999
#> VEGF -4.812e-01  6.180e-01  1.238e+00 -0.389 0.697
#> 
#> Likelihood ratio test=2.8  on 2 df, p=0.2464
#> n= 7, number of events= 3

Created on 2019-07-17 by the reprex package (v0.3.0)
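As for why the original loop fails: inside a formula, genes[i] is not replaced by the column name. It is evaluated as an ordinary expression and yields a length-1 character vector such as "ATRX", while survival, vital.status and Sex each have 7 values, hence "variable lengths differ". If you would rather keep a plain for loop than use purrr, a minimal sketch could look like the one below. It uses the same as.formula(paste(...)) idea as above; the names fits and gene are only illustrative, and the dummy data is repeated so the block runs on its own.

```r
library(survival)

# Same dummy data as in the question
mydata <- read.table(header = TRUE, text = "ATRX IDH1 VEGF Sex survival vital.status
2 1 1 2 1.419178082 2
2 1 1 1 5 1
2 1 1 2 1.082191781 2
1 1 1 1 0.038356164 1
2 1 2 2 0.77260274 2
1 1 2 2 2.336986301 1
2 1 2 1 1.271232877 1")

genes <- names(mydata)[1:3]

# Named list holding one fitted model per gene (illustrative name)
fits <- vector("list", length(genes))
names(fits) <- genes

for (gene in genes) {
  # Build the formula as text first, so the gene name becomes a real model term
  f <- as.formula(paste("Surv(survival, vital.status) ~ Sex +", gene))
  fits[[gene]] <- coxph(f, data = mydata)
}

# Inspect a single model, for example the one for VEGF
print(fits[["VEGF"]])
```

Storing the models in a named list, rather than only printing them, lets you call summary() or coef() on any single fit later. With only 7 rows, coxph will likely emit convergence warnings here; that comes from the tiny dummy data set, not from the loop.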


Thank you very much!
