Hello everyone,
I am performing multiple imputation on my dataset with the R mice package before carrying out my primary analysis, a binary logistic regression with functional decline as the outcome and several continuous and binary variables as predictors. I am using pmm for continuous variables and logreg for binary variables, with m=5, maxit=10 and a fixed seed for reproducibility.
I need to perform a sensitivity analysis with the delta-adjustment method to see if my results are robust to potential violations of the MAR assumption.
In particular, the hypothesis is that participants whose outcome is missing are more likely to decline than those whose outcome is observed. Among those for whom functional decline is observed, 26% are decliners. Assuming (from the literature) that the percentage of functional decline rises to 60% among those for whom the variable is missing, and following Rezvan et al. (Sensitivity analysis within multiple imputation framework using delta-adjustment: Application to Longitudinal Study of Australian Children, Longitudinal and Life Course Studies, p. 266), I calculated delta = log(4.3). The formula can be found at the end of the question, if of interest.
I am a long-term SPSS user who is only now shifting to R, and I still struggle with writing my own syntax. After an extensive search on the internet, the only available syntax I found for customizing the mice.impute.logreg function with a delta parameter for a binary categorical variable was in the supplementary material (web appendix 2, p. 9) of this paper by Leacy et al. [https://doi.org/10.1093/aje/kww107]. I adapted the code to my case as shown below.
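# customizing the mice.impute.logreg function according to Leacy et al.
library(mice)
mice.impute.logreg.sens <- function(y, ry, x, delta, ...) {
  # add an intercept column to the predictor matrix
  x <- cbind(1, as.matrix(x))
  # fit the logistic imputation model on the cases with observed y
  expr <- expression(glm.fit(x[ry, ], y[ry],
                             family = binomial(link = logit)))
  fit <- suppressWarnings(eval(expr))
  fit.sum <- summary.glm(fit)
  beta <- coef(fit)
  # delta adjustment: shift the intercept on the log-odds scale
  beta[1] <- beta[1] + delta
  # perturb the (shifted) coefficients using the estimated covariance
  rv <- t(chol(fit.sum$cov.unscaled))
  beta.star <- beta + rv %*% rnorm(ncol(rv))
  # predicted probabilities for the cases with missing y
  p <- 1 / (1 + exp(-(x[!ry, ] %*% beta.star)))
  # draw the imputed 0/1 values from those probabilities
  vec <- (runif(nrow(p)) <= p)
  vec[vec] <- 1
  if (is.factor(y)) {
    vec <- factor(vec, c(0, 1), levels(y))
  }
  return(vec)
}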
# applying it to my dataset and carrying out the binary logistic regression (note: here I have simplified the dataset for the reprex)
myvars <- c("FunctionalDecline", "Age", "Sex", "GDS4")
realdataset <- dataset[myvars]
realdataset$FunctionalDecline <- factor(realdataset$FunctionalDecline)
realdataset$Sex <- factor(realdataset$Sex)
ini <- mice(realdataset, maxit = 0)
meth <- ini$method
meth["FunctionalDecline"] <- "logreg.sens"
imputeddataset <- mice(realdataset, method = meth, seed = 3, m = 5, maxit = 10, delta = log(4.3), printFlag = FALSE)
summary(pool(with(imputeddataset, glm(FunctionalDecline ~ Age + Sex + GDS4, family = binomial))))
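One check that might help here is to look at the share of decliners among the imputed values themselves, since the delta shift should show up there before it shows up in the regression. The sketch below is only illustrative: `imputed_share` is a hypothetical helper (not part of mice), the toy data frame holds made-up values, and it assumes the decline level is coded "1" and that the imputations are stored with one column per imputation, as in `imputeddataset$imp$FunctionalDecline`.

```r
# Hypothetical helper (not part of mice): share of imputed values equal to
# `level`, computed separately for each imputation (one column per imputation).
imputed_share <- function(imp_df, level = "1") {
  vapply(imp_df, function(col) mean(as.character(col) == level), numeric(1))
}

# Toy stand-in for imputeddataset$imp$FunctionalDecline (made-up values):
# 4 missing cases, 2 imputations.
toy_imp <- data.frame(
  `1` = factor(c(1, 0, 1, 1), levels = c(0, 1)),
  `2` = factor(c(0, 0, 1, 1), levels = c(0, 1)),
  check.names = FALSE
)

shares <- imputed_share(toy_imp)
print(shares)
```

With the objects from my code the call would be `imputed_share(imputeddataset$imp$FunctionalDecline)`. If the adjustment works as intended, I would expect these shares to sit well above the observed 26% for delta = log(4.3), and close to the unadjusted imputations for delta = log(1.0).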
The syntax runs without error messages, but I have several issues:
1) The results of the binary logistic regression after the delta adjustment are unexpected: all predictors lose significance relative to the binary logistic regression on the imputed dataset without any sensitivity analysis. I also ran a “rougher” (worst-case scenario) sensitivity analysis by
(i) doing the imputation and, once it was complete, setting all imputed functional decline values to 1 (i.e. decline) and then running the binary logistic regression
(ii) doing the imputation with the “post” argument of mice, which post-processes the imputed values within each iteration [https://doi.org/10.18637/jss.v045.i03, p. 32-34], so as to set all imputed functional decline values to 1, and then running the binary logistic regression
In both cases (i) and (ii) I get similar results from the binary logistic regression, with most predictors remaining significant.
2) To approximate the “worst-case scenario” analysis, I assumed 99% decliners among those with a missing outcome, which gives delta = 281.7. If I set delta = 281.7, again all predictors lose significance in the subsequent binary logistic regression.
3) If I set delta = log(1.0) (i.e. no adjustment, MAR), again all predictors lose significance in the subsequent logistic regression.
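To make the three delta values above easier to compare, here is a rough sketch of the proportion of decliners each one implies when it is added to the log-odds of the observed 26%. This is only a marginal approximation that ignores the covariates in the imputation model; the vector names are my own labels, and it uses only base R (`qlogis` for the logit, `plogis` for its inverse).

```r
# Observed proportion of decliners (from the post)
p_observed <- 0.26

# Delta values as entered in the three runs described above
deltas <- c(no_shift = log(1.0), assumed_60 = log(4.3), as_entered = 281.7)

# Shift the observed log-odds by delta and map back to a proportion
implied_p <- plogis(qlogis(p_observed) + deltas)
print(round(implied_p, 3))
```

On this scale, delta = log(1.0) leaves the proportion at 26%, delta = log(4.3) gives roughly 60%, and delta = 281.7 pushes the proportion to essentially 100%.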
I have tried increasing the number of iterations and imputations (maxit=20, m=50) and changing the visiting sequence, but the results remain substantially unchanged. Also, when I ran sensitivity analyses for some of the continuous variables in my dataset (following the code in [https://www.gerkovink.com/miceVignettes/Sensitivity_analysis/Sensitivity_analysis.html]), I found no such inconsistencies.
Thank you very much for any help you can give me!
PS: I am new to the forum, and if the question is unclear or not detailed enough (I have simplified the dataset for the example), I will be happy to edit the post based on your suggestions.
Formula in Rezvan et al.:
delta = log[ (π_missing / (1 - π_missing)) / (π_observed / (1 - π_observed)) ]
where π_missing is the expected proportion of functional decline among those with missing data on functional decline, and π_observed is the proportion of functional decline among those with observed data on functional decline.
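For reference, a minimal sketch of this formula in base R, in case it helps to check my arithmetic. The function name `delta_from_props` is hypothetical (my own, not from Rezvan et al. or mice).

```r
# Hypothetical helper: log odds ratio between the assumed proportion among
# the missing and the observed proportion, as in the formula above.
delta_from_props <- function(p_missing, p_observed) {
  odds_missing  <- p_missing  / (1 - p_missing)
  odds_observed <- p_observed / (1 - p_observed)
  log(odds_missing / odds_observed)
}

delta_60 <- delta_from_props(0.60, 0.26)  # assumed 60% among the missing
delta_99 <- delta_from_props(0.99, 0.26)  # near worst-case, 99% among the missing

# Odds ratios (inside the log) and the resulting deltas
print(round(exp(c(delta_60, delta_99)), 1))
print(round(c(delta_60, delta_99), 2))
```

Computed this way, the 60% scenario gives an odds ratio of about 4.3, and the 99% scenario gives an odds ratio of about 281.8, whose log is about 5.64.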