I have a data frame called Covars that holds subject-level covariates. I fit linear models on it to predict, and so impute, missing weight and height values from other columns that have no missing values. That worked. I then tried the analogous approach to build a vector of predicted values for imputing missing creatinine values. This time predict.lm complains that it cannot find a variable that it had no trouble finding earlier.
# 3rd model: Weight as function of age, sex, patient, country
lm.Wt.vs.AgeSexPatientCountry <-
  lm(Covars$WEIGHT ~ Covars$AGE + Covars$SEXC + Covars$AGE*Covars$SEXC +
       Covars$PATIENT + Covars$COUNTRY,
     na.action = na.exclude)
summary(lm.Wt.vs.AgeSexPatientCountry) # adjusted R-squared: 0.3611
# I skip showing the output; all is well so far.
predWEIGHT <- predict.lm(object = lm.Wt.vs.AgeSexPatientCountry,
                         newdata = data.frame(Covars$AGE,      # no NA's in AGE
                                              Covars$SEXC,     # no NA's in SEXC
                                              Covars$PATIENT,  # no NA's
                                              Covars$COUNTRY)  # no NA's
                         ) # 29295 large numeric named vector
# predWEIGHT gets made, 29295 elements. All is well still.
sum(is.na(predWEIGHT)) # 0
summary(Covars$WEIGHT - predWEIGHT)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
# -60.324 -10.223  -1.444   0.000   8.461 118.908
# Covars$SEXC clearly exists and was found above, but now this:
# Try both race and country.
lm.CREATvSexAge2.LBM.S_Lint.PatientRaceCountry <-
  lm(CREAT ~ SEXC + AGE + I(AGE^2) + LBM + SEXC:LBM + PATIENT + RACE2 + COUNTRY,
     data = Covars)
summary(lm.CREATvSexAge2.LBM.S_Lint.PatientRaceCountry) # Adj. R^2: 0.2523
plot(x = lm.CREATvSexAge2.LBM.S_Lint.PatientRaceCountry$fitted.values,
     # SEXC + AGE + I(AGE^2) + LBM + SEXC:LBM + PATIENT + RACE2 + COUNTRY
     y = lm.CREATvSexAge2.LBM.S_Lint.PatientRaceCountry$residuals +
       lm.CREATvSexAge2.LBM.S_Lint.PatientRaceCountry$fitted.values,
     # CREAT itself
     panel.first = grid(8,8),
     ylim = c(0, 250),
     xlab = "model fitted values of SEX, AGE, LBM, PATIENT, RACE, COUNTRY",
     ylab = "creatinine (μmol/L)",
     #log = "y",
     pch = '.', cex = 0.1, col = "blue")
# summary() and the plot above both work for this model, but predicting CREAT does not:
### Now impute CREAT values.
predCREAT <- predict.lm(object = lm.CREATvSexAge2.LBM.S_Lint.PatientRaceCountry,
                        newdata = data.frame(Covars$AGE,
                                             Covars$SEXC,
                                             Covars$LBM,
                                             Covars$PATIENT,
                                             Covars$RACE2,
                                             Covars$COUNTRY)
                        )
# Error in eval(predvars, data, env) : object 'SEXC' not found
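This error puzzles me, because predWEIGHT above was computed successfully with a newdata argument built in exactly the same way. Why can predict.lm not find SEXC here, and how should I build newdata so that the creatinine prediction works?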
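
In case it helps anyone reproduce this without my data, here is a minimal sketch of the two calling styles on a made-up data frame. The names toy, nd, fit_dollar and fit_data are hypothetical and exist only for this illustration. It assumes a fresh R session with no object called x in the workspace.

```r
# Toy data (hypothetical; stands in for my real Covars)
set.seed(1)
toy <- data.frame(x = rnorm(20))
toy$y <- 2 * toy$x + rnorm(20)

# newdata built the same way as in my two predict.lm calls
nd <- data.frame(toy$x)
names(nd)  # the column name that data.frame() generated

# Style 1 (like my weight model): formula written with toy$..., no data argument
fit_dollar <- lm(toy$y ~ toy$x)
pred_dollar <- predict(fit_dollar, newdata = nd)
length(pred_dollar)

# Style 2 (like my creatinine model): bare column names plus data = toy
fit_data <- lm(y ~ x, data = toy)
pred_data <- try(predict(fit_data, newdata = nd), silent = TRUE)
inherits(pred_data, "try-error")  # TRUE means predict() raised an error
```

As far as I can tell, the first predict call should go through and the second should fail with an "object ... not found" error, so the difference seems tied to how the formula was written rather than to anything specific in Covars.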