I fitted a model using the RefBasedMI package to Implement Copy-Reference Method for Missing Data Imputation. I later used mice to perform MMRM model on my data. Can someone help me extract LS‑means (SE) , CI of LS-means for each treatment group and treatment difference (SE) , CI of treatment difference, and p-value at Time = 2?
Thanks
Alfred
Below is my program and simulated dataset:
#####################################Program#####################################
1. Install and load the package
install.packages("RefBasedMI")
library(RefBasedMI)
2. Example Data Structure
'my_data' should be in long format with columns:
ID, Treatment, Time, Baseline_Covar, and Outcome
Outcome has missing values (NA) after dropout.
#Prepare dataset
data.new <- data %>%
mutate(
id = as.numeric(Patient),
trt = ifelse(Group=="Control",0,1),
sex = ifelse(Sex=="M",1,2),
t = as.numeric(Time)
)
data.new <- data.new[order(data.new$Patient), ]
3. Perform Copy-Reference Imputation
imputed_results <- RefBasedMI(
data = data.new,
depvar = UPDRS, # Dependent variable to impute
treatvar = trt, # Treatment group variable
idvar = id, # Unique participant identifier
timevar = t, # Time points (longitudinal)
covar = c(Age,sex), # Complete baseline covariates
method = "CR", # "CR" stands for Copy-Reference
reference = 0, # Name of the control/reference arm
M = 10, # Number of imputed datasets to create
seed = 123 # For reproducibility
)
4. View Results
The output is a data frame containing M imputed datasets
Stacked under the original data (where .imp > 0)
Fit regression model to each imputed data set by treating output data frame as mids object
mmrm_fits <- with(data=as.mids(imputed_results),
mmrm(UPDRS ~ Group * Time + Age + Sex + us(Time | Patient)))
###############################Simulated Dataset#####################################
set.seed(456)
library(tidyverse)
library(dplyr)
Non-randomized observational study: older and more females tend to get treatment
n_patients <- 400
patients <- 1:n_patients
times <- c(0, 1, 2) # baseline, 1 year, 2 years
Non-random group assignment: function for treatment probability
prob_treatment <- function(age, sex) {
0.3 + 0.005 * (age - 65) + 0.15 * (sex == "F")
}
Create baseline covariates first
baseline_covariates <- data.frame(
Patient = as.factor(patients),
Age = rnorm(n_patients, mean = 70, sd = 12), # Older population
Sex = sample(c("M", "F"), n_patients, replace = TRUE, prob = c(0.4, 0.6)) # More females
)
Assign groups based on age and sex (non-random, simulating selection bias)
baseline_covariates$Group <- as.factor(sapply(1:n_patients, function(i) {
if (runif(1) < prob_treatment(baseline_covariates$Age[i], baseline_covariates$Sex[i])) {
"Treatment"
} else {
"Control"
}
}))
Create longitudinal data frame
data <- expand.grid(Patient = patients, Time = times)
data$Patient <- as.factor(data$Patient)
data$Time <- as.factor(data$Time)
Merge with baseline covariates
data <- data %>% left_join(baseline_covariates, by = "Patient")
Simulate UPDRS scores
data$UPDRS <- rnorm(nrow(data), mean = 35 + 5 * as.numeric(as.character(data$Time)) + ifelse(data$Group == "Treatment", -1.5, 0), sd = 6)
Introduce missing data (MAR)
data$UPDRS <- ifelse(runif(nrow(data)) < 0.25 & data$Time != "0", NA, data$UPDRS)
cat("Dataset: Non-Randomized Observational Study\n")