I have two questions, one more practical and the other more conceptual.
I want to run regression models using data from Demographic and Health Surveys (DHS) and then bootstrap predictions based on the model. The data sets come with weights to ensure representativeness of the population surveyed. My practical question is how do I make rsample
take account of these weights when bootstrapping?
The more conceptual question is what level to "strata" at. The DHS data are collected using two-stage cluster sampling (https://dhsprogram.com/pubs/pdf/WP30/WP30.pdf). Each cluster has 5-40 observations depending on survey and my data cleaning/selection. All observations within a cluster have the same weight. It would seem that strata should be at the cluster level, but this level is way below 15% of the survey sample (normally around 6,000). Strata at cluster level would solve my weighting question, but substitute a new one if all strata of less than 15% are pooled.
Thank you for any help/comments/suggestions,
Claus