I have a survey with 12 strata (the geographic area was split into 12 strata), and I am interested in computing point estimates for the entire geographic area. Power allocation was used to determine the sampling rates. To address non-response bias, the weights were built in two steps:
- A baseweight was first calculated for each stratum. It upweights or downweights the responses from that stratum so that they align with the stratum's share of households (e.g., if 105 surveys were returned from a stratum with 7,217 of the 116,487 total households, and 950 surveys were returned overall, then the baseweight for this stratum would be 0.56 = (7217 / 116487) / (105 / 950)).
- The weights were then raked (by stratum) using the `anesrake` package, with the computed baseweights passed as input to the call to `anesrake`.
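For concreteness, the baseweight arithmetic for the example stratum can be reproduced in base R. This is only a sketch of the calculation described above; the variable names are illustrative and are not columns in my data.

```r
# Figures from the example stratum described above
stratum_households <- 7217    # households in this stratum
total_households   <- 116487  # households across all 12 strata
stratum_returns    <- 105     # surveys returned from this stratum
total_returns      <- 950     # surveys returned overall

# Share of the population vs. share of the returned sample
population_share <- stratum_households / total_households
sample_share     <- stratum_returns / total_returns

# Baseweight: below 1 means the stratum is over-represented among returns
baseweight <- population_share / sample_share
round(baseweight, 2)  # 0.56
```

A value below 1 here reflects that this stratum makes up a larger share of the returned surveys than of the households.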
Now that I have the raked weights, which should (in theory) account for non-response bias, I would like to perform the survey analysis using the R `survey` package. I have specified the survey design as follows:
    design = svydesign(ids=~SID,
                       strata=~Stratum,
                       fpc=~FPC,
                       weights=~raked_weights,
                       data=my_data)
I then calculate the estimates for the area (for different age groups) using the following code:
    area_estimates = svyby(formula=~Weight_Class,
                           by=~Age_Group,
                           FUN=svymean,
                           design=design,
                           keep.names=FALSE,
                           vartype='ci',
                           method='beta')
Am I correct in saying that, by specifying the raked weights in the survey design, the standard errors of the estimates (used in the confidence intervals) will be correct? Or is there something else I need to specify to tell the `survey` package that I raked the weights?