How to subset a multiply imputed list of dataframes?

I used MICE to do a multiple imputation with 50 iterations on a large dataset that had some missing values. Now I have three exclusion criteria that I need to apply to the imputed datasets before I can run my analyses:

  1. resident=1,
  2. ages 15-49, and
  3. remove rows where the variable county_fips equals "99999".

From another forum, I have this code that works for the first step:

uhbs_clean <- with(uhbs_imp8,
             {dat <- data.frame(bfacil,hosp_ob,AgeCont,mateduc,race_hispan,urban_rural3,MARITAL,prev_children,dplural,county_fips,resident)
             dat <- dat[dat$resident==1,]})

But when I try to do that same thing for the next step (use uhbs_clean as the input dataset and change the resident==1 to AgeCont>14), I get an error that the first variable, bfacil, isn't found. I also tried to do all of the steps in one code block like this:

uhbs_clean <- with(uhbs_imp8,
             {dat <- data.frame(bfacil,hosp_ob,AgeCont,mateduc,race_hispan,urban_rural3,MARITAL,prev_children,dplural,county_fips,resident)
             dat <- dat[dat$resident==1,]
             dat <- dat[dat$Agecont>14,]
             dat <- dat[dat$Agecont<50,]
             dat <- dat[dat$county_fips!="99999",]})

But that didn't work either; it said that the data frame had zero rows. Next, I tried just using subset:

uhbs_clean <- with(uhbs_imp8, subset(uhbs_imp8, uhbs_imp8$data$resident==1))

And that ran without errors, but didn't work because now all of the columns are gone.

R is not my first coding language so I am kind of stuck! This is also my first time posting here so if there are things I need to add please let me know. My data isn't publicly available so I don't know how to give an example. Any help would be really appreciated. Thanks.

Hi @joachimg1,

I'm not sure I understand your issue correctly, but why not use the tidyverse packages, or more specifically the dplyr package?

install.packages("dplyr")

Then, your code may look something like this:

library(dplyr)

uhbs_clean <- uhbs_imp8 %>%
              as.data.frame() %>%
              filter(
                     resident == 1,
                     # ages 15 to 49 inclusive; R is case-sensitive, so AgeCont, not Agecont
                     between(AgeCont, 15, 49),
                     county_fips != "99999"
              )
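A side note on the zero-row result from the question: R is case-sensitive, and the all-in-one block there builds the data frame from AgeCont but then filters on dat$Agecont. That is probably why every row disappeared. Here is a minimal base R sketch with made-up data (the values are purely illustrative, only the column name comes from the question):

```r
# Made-up data for illustration only; the column name mirrors the question.
dat <- data.frame(AgeCont = c(12, 15, 30, 49, 55))

# Wrong case: dat$Agecont does not exist, so it is NULL.
# NULL > 14 evaluates to logical(0), and indexing rows with
# logical(0) keeps no rows at all.
bad <- dat[dat$Agecont > 14, , drop = FALSE]
nrow(bad)   # 0

# Correct case: keeps ages 15 to 49 inclusive.
good <- dat[dat$AgeCont >= 15 & dat$AgeCont <= 49, , drop = FALSE]
nrow(good)  # 3
```

Note that R does not raise an error for the misspelled column, it just silently returns an empty data frame.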

I don't know what format uhbs_imp8 has, so as.data.frame() may throw an error if the object cannot be coerced to a data frame (for example, if it is a mids object from mice rather than a plain data frame).
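If uhbs_imp8 turns out to be a mids object (what mice() returns), one commonly used route is to stack the imputations into long format, filter, and convert back. The following is only a sketch under assumptions: it uses the nhanes example data that ships with mice as a stand-in for your data, and the names imp, long, long_sub and imp_sub are made up for illustration. It also assumes the filter variables are fully observed; if they were imputed, different imputations could keep different rows and the conversion back may not be meaningful.

```r
library(mice)   # assumed to be installed
library(dplyr)

# Stand-in for uhbs_imp8: impute the nhanes example data from mice.
imp <- mice(nhanes, m = 5, printFlag = FALSE, seed = 1)

# Stack the original data (.imp == 0) and all imputations
# into one long data frame.
long <- complete(imp, action = "long", include = TRUE)

# Filter on a fully observed variable (age has no missing values
# in nhanes), so every imputation keeps the same rows.
long_sub <- dplyr::filter(long, age %in% c(1, 2))

# Convert back to a mids object so with() and pool() still work.
imp_sub <- as.mids(long_sub)

# Example analysis on the subsetted imputations.
fit <- with(imp_sub, lm(bmi ~ age))
pool(fit)
```

For your data, the filter line would presumably become something like dplyr::filter(long, resident == 1, AgeCont >= 15, AgeCont <= 49, county_fips != "99999"), with uhbs_imp8 in place of imp.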

Still, I hope this helps :wink: If it doesn't, please post the error messages you get.