replicate weights. Linear model. survey

I would like to run a linear regression on complex survey data, and I need to use replicate weights. I have one data frame with the data and another with the replicate weights, but I don't know how to proceed.

data <- data.frame(Country = c("ESP", "AUT", "POR", "GRE", "ITA", "USA", "FRA", "GER", "DEN", "BRA", "AUS", "CHI"),
                   var2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 10, 11),
                   var3 = c(1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0),
                   var4 = c(501, 700, 71, 800, 801, 71, 81, 91, 91, 80, 71, 90),
                   my_weights = c(14, 10, 11, 10, 18, 17, 18, 19, 10, 10, 17, 11))

replicate.weights <- data.frame(factor1 = c(12, 12, 53, 84, 95, 86, 77, 88, 99, 99, 10, 110),
                                factor2 = c(51, 50, 51, 50, 31, 31, 31, 21, 11, 10, 11, 10),
                                factor3 = c(5, 10, 511, 506, 31, 341, 351, 201, 110, 101, 11, 10),
                                factor4 = c(1, 5, 511, 50, 1, 31, 301, 21, 110, 101, 11, 1),
                                factor5 = c(510, 10, 521, 580, 1, 31, 1, 21, 131, 10, 111, 1),
                                factor6 = c(51, 1, 21, 50, 10, 31, 13, 21, 11, 10, 11, 1))

With only "conventional" weights, I know it is possible to use weights = my_weights, as here:

lm(Y ~ X1 + X2, data = my_data, weights = my_weights)

But with the data and the replicate weights in two separate data frames, I don't know how to do it.
I think I need the survey package, but I can't find any examples of a linear regression.

With the data you provided, here is how to do this for a regression of var2 ~ var3 + var4. It's unclear whether your replicate weights need a scaling constant; for now, none is used.

library(survey)
#> Warning: package 'survey' was built under R version 4.3.2
#> Loading required package: grid
#> Loading required package: Matrix
#> Warning: package 'Matrix' was built under R version 4.3.1
#> Loading required package: survival
#> 
#> Attaching package: 'survey'
#> The following object is masked from 'package:graphics':
#> 
#>     dotchart
data <- data.frame ("Country" = c ("ESP", "AUT", "POR", "GRE", "ITA", "USA", "FRA", "GER", "DEN", "BRA", "AUS", "CHI"),
                    var2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 10,11), 
                    var3 = c (1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1,0),
                    var4 = c(501, 700, 71, 800, 801, 71, 81, 91, 91, 80, 71,90),
                    my_weights  = c(14, 10, 11, 10, 18, 17, 18, 19, 10, 10, 17, 11))

replicate.weights <- data.frame (factor1 = c(12, 12, 53, 84, 95, 86, 77, 88, 99, 99, 10,110), 
                                 factor2 = c (51, 50, 51, 50, 31, 31, 31, 21, 11, 10, 11,10),
                                 factor3 = c (5, 10, 511, 506, 31, 341, 351, 201, 110, 101, 11,10),
                                 factor4 = c (1, 5, 511, 50, 1, 31, 301, 21, 110, 101, 11,1),
                                 factor5 = c (510, 10, 521, 580, 1, 31, 1, 21, 131, 10, 111,1),
                                 factor6 = c (51, 1, 21, 50, 10, 31, 13, 21, 11, 10, 11,1))

my_rep_obj <- svrepdesign(data=data, repweights=replicate.weights, weight=~my_weights, type="other", combined.weights=TRUE) 
#> Warning in svrepdesign.default(data = data, repweights = replicate.weights, :
#> scale or rscales not specified, set to 1
# I'm not sure what type of replicate weights you provided

svyglm(var2~var3+var4, design=my_rep_obj)
#> Call: svrepdesign.default(data = data, repweights = replicate.weights, 
#>     weight = ~my_weights, type = "other", combined.weights = TRUE)
#> with 6 replicates.
#> 
#> Call:  svyglm(formula = var2 ~ var3 + var4, design = my_rep_obj)
#> 
#> Coefficients:
#> (Intercept)         var3         var4  
#>   10.129459    -2.148303    -0.007404  
#> 
#> Degrees of Freedom: 11 Total (i.e. Null);  3 Residual
#> Null Deviance:       117.2 
#> Residual Deviance: 61.52     AIC: 62.08

Created on 2023-12-28 with reprex v2.0.2
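
Printing the fitted object only shows the coefficients. To see the replicate-based standard errors, something like the sketch below should work. It rebuilds the same toy design, passing scale = 1 and rscales of all ones explicitly (the same values the warning above says were set by default, and still only an assumption about your weights), and the name fit is just an illustrative choice.

```r
library(survey)

# Same toy data as above
data <- data.frame(Country = c("ESP", "AUT", "POR", "GRE", "ITA", "USA",
                               "FRA", "GER", "DEN", "BRA", "AUS", "CHI"),
                   var2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 10, 11),
                   var3 = c(1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0),
                   var4 = c(501, 700, 71, 800, 801, 71, 81, 91, 91, 80, 71, 90),
                   my_weights = c(14, 10, 11, 10, 18, 17, 18, 19, 10, 10, 17, 11))

replicate.weights <- data.frame(
  factor1 = c(12, 12, 53, 84, 95, 86, 77, 88, 99, 99, 10, 110),
  factor2 = c(51, 50, 51, 50, 31, 31, 31, 21, 11, 10, 11, 10),
  factor3 = c(5, 10, 511, 506, 31, 341, 351, 201, 110, 101, 11, 10),
  factor4 = c(1, 5, 511, 50, 1, 31, 301, 21, 110, 101, 11, 1),
  factor5 = c(510, 10, 521, 580, 1, 31, 1, 21, 131, 10, 111, 1),
  factor6 = c(51, 1, 21, 50, 10, 31, 13, 21, 11, 10, 11, 1))

# scale = 1 and rscales = 1 are assumptions, not known properties of these weights
my_rep_obj <- svrepdesign(data = data, repweights = replicate.weights,
                          weights = ~my_weights, type = "other",
                          scale = 1, rscales = rep(1, ncol(replicate.weights)),
                          combined.weights = TRUE)

fit <- svyglm(var2 ~ var3 + var4, design = my_rep_obj)

summary(fit)                       # coefficient table with standard errors
est <- coef(fit)                   # point estimates
std_err <- sqrt(diag(vcov(fit)))   # standard errors from the replicate weights
```

The point estimates only depend on my_weights; the replicate weights enter through the standard errors, so those are the numbers that change once the correct type and scale are known.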

Some documentation on linear regression in the survey package: Regression models
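
On the scaling constant: with type = "other", the variance is the scaled sum of squared deviations of the replicate estimates, so the scale argument changes the standard errors but not the coefficients. The sketch below compares scale = 1 with a purely hypothetical jackknife-style constant (R - 1) / R, where R is the number of replicates. The names hyp_scale, des_unit and des_scaled are made up for illustration; the right constant should come from your survey's documentation.

```r
library(survey)

data <- data.frame(var2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 10, 11),
                   var3 = c(1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0),
                   var4 = c(501, 700, 71, 800, 801, 71, 81, 91, 91, 80, 71, 90),
                   my_weights = c(14, 10, 11, 10, 18, 17, 18, 19, 10, 10, 17, 11))

replicate.weights <- data.frame(
  factor1 = c(12, 12, 53, 84, 95, 86, 77, 88, 99, 99, 10, 110),
  factor2 = c(51, 50, 51, 50, 31, 31, 31, 21, 11, 10, 11, 10),
  factor3 = c(5, 10, 511, 506, 31, 341, 351, 201, 110, 101, 11, 10),
  factor4 = c(1, 5, 511, 50, 1, 31, 301, 21, 110, 101, 11, 1),
  factor5 = c(510, 10, 521, 580, 1, 31, 1, 21, 131, 10, 111, 1),
  factor6 = c(51, 1, 21, 50, 10, 31, 13, 21, 11, 10, 11, 1))

R <- ncol(replicate.weights)
hyp_scale <- (R - 1) / R   # hypothetical constant, NOT known to be right for these weights

# Helper that builds the design for a given scale (rscales fixed at 1)
make_design <- function(scale_value) {
  svrepdesign(data = data, repweights = replicate.weights,
              weights = ~my_weights, type = "other",
              scale = scale_value, rscales = rep(1, R),
              combined.weights = TRUE)
}

des_unit   <- make_design(1)
des_scaled <- make_design(hyp_scale)

fit_unit   <- svyglm(var2 ~ var3 + var4, design = des_unit)
fit_scaled <- svyglm(var2 ~ var3 + var4, design = des_scaled)

se_unit   <- sqrt(diag(vcov(fit_unit)))
se_scaled <- sqrt(diag(vcov(fit_scaled)))

# Same coefficients; standard errors differ by a factor of sqrt(hyp_scale)
cbind(coef = coef(fit_unit), se_unit = se_unit, se_scaled = se_scaled)
```

If the data provider states a specific replication method (for example a jackknife or BRR variant), it is probably better to pass the matching type to svrepdesign() than to set the constants by hand.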


Thanks a lot! I am finally starting to understand the syntax.
Do you know of any books or websites with examples on this topic? I've only found "Complex Surveys" by Thomas Lumley.
