One-way Frequency table with complex survey data

Hello,
I just want to briefly mention that I’m fairly new to R, and this is the first analysis I have tried to run in the context of a complex survey design. Below is a description of my problem along with a few questions.
I want to produce a one-way frequency table. The data set contains survey weights (WTS_S), along with 500 bootstrap weights (BSW1-BSW500). The variable of interest is PROVINCE. The method to estimate variance is BRR. I’m using the following code to capture the survey design.
CCHSDesign = svrepdesign(data = myData,
weights = ~WTS_S,
repweights = "BSW[1-500]",
combined.weights = F,
type = "BRR")

I use combined.weights = F since repweights does not contain the sampling weights. Nevertheless, I get the following warning message:
Warning message:
In svrepdesign.default(data = myData, weights = ~WTS_S, repweights = "BSW[1-500]", :
Data look like combined weights: mean replication weight is 1303.63701983038 and mean sampling weight is 1302.86018957346

The previous message does not appear when I use combined.weights = T, which does not seem appropriate in my situation.
Question 1: Is the previous warning message of any concern to me?
Question 2: Why do I get “Balanced Repeated Replicates with 456 replicates.” instead of 500 when I run CCHSDesign?

In addition to the frequencies by category of PROVINCE, I also want the standard errors (SEs). Running the following code provides the frequencies, but not the SEs.
prop.table(svytable(~PROVINCE,design=CCHSDesign))*100

Question 3: How can I get the SEs in addition to the frequencies by categories of PROVINCE?
Thanks a lot for your support,
A.G.

Welcome to the forum.
I think we need more information and sample data on the problem. We need to know things such as what package(s) you are working with and the structure of the data. It may also help to know your OS and what versions of R and RStudio you are using.

See this for some suggestions on how to craft a reprex (reproducible example).

Hi jrkrideau,

Thanks for your quick reply. Please see below the further details you asked me to provide.
OS: Windows 10 (64-bit)
R version: 3.6.2
R studio version: 3.5
Package: survey
Structure of data: One observation per individual, some variables of interest (e.g. age), one survey weight (WTS_S) and 500 bootstrap weights (BSW1-BSW500).

For simplicity, I include only a random subsample of 20 individuals, and I kept 20 bootstrap weights.
I ran “svrepdesign” on this subsample, now using repweights = "BSW[1-20]", and it still reads fewer than 20 bootstrap weights (13 of them). I copied the data below, since I cannot upload a txt or csv file. Any recommendation for sharing sample data in the future is more than welcome.
Hopefully, this is going to work for you.

"AGE" "WTS_S" "BSW1" "BSW2" "BSW3" "BSW4" "BSW5" "BSW6" "BSW7" "BSW8" "BSW9" "BSW10" "BSW11" "BSW12" "BSW13" "BSW14" "BSW15" "BSW16" "BSW17" "BSW18" "BSW19" "BSW20"
61 124 0 0 480 124 119 137 184 0 0 0 0 0 131 0 95 232 141 195 0 165
71 259 0 0 260 215 677 1913 0 0 0 0 265 494 0 584 0 458 0 0 244 649
8 71 0 0 0 63 39 79 0 87 0 0 0 79 154 0 58 148 57 45 0 94
57 81 149 51 40 190 0 0 92 282 787 166 717 72 274 593 0 181 70 0 264 89
44 414 732 625 331 686 0 464 0 452 526 399 361 0 0 231 0 0 659 317 1025 0
70 3579 19176 0 3132 11170 0 3658 0 0 8499 0 0 0 7651 0 4278 18363 12073 0 0 4125
76 604 562 1195 732 744 709 0 1139 1223 536 507 1369 645 0 1081 629 0 646 524 506 0
32 2280 1887 3273 2418 2568 4373 5122 2109 0 0 2904 8608 6412 2539 0 0 4131 2842 2118 3161 2703
15 509 570 842 1448 0 595 1374 544 0 0 0 1962 0 0 0 477 0 0 457 1210 0
55 286 602 0 0 0 314 0 283 268 262 623 754 0 1001 0 0 642 590 293 0 491
43 2323 3526 0 2616 6946 2631 0 2099 8367 0 4175 2501 3023 2503 0 0 1695 6169 2804 1686 2962
45 2020 3058 4034 2455 6328 0 0 2349 2011 1948 2761 1791 0 2534 4092 0 0 6721 0 0 3787
17 1978 0 2160 0 0 11819 6682 7686 3857 2369 5824 0 0 0 1771 0 1691 1869 0 0 4076
11 385 0 430 1615 303 0 0 0 896 459 1132 0 457 401 0 1037 1355 287 356 327 534
9 475 523 503 573 0 1137 0 0 0 976 586 514 0 412 659 399 979 0 347 501 0
12 736 2175 1609 0 625 961 732 0 2034 1352 944 801 0 598 0 0 691 1379 977 1331 0
39 761 938 704 758 423 0 1937 0 1534 1080 773 0 0 813 693 0 706 0 787 0 0
18 834 2552 1006 0 0 456 834 2015 2098 679 0 0 0 0 900 492 707 698 1938 0 1373
14 1365 1385 1565 1612 0 1430 0 3195 0 0 1649 1174 2582 4827 1772 2806 1075 0 0 0 2783
58 1417 0 1691 0 0 0 4075 2761 1437 5241 0 0 1552 2915 1965 0 0 0 0 1403 1888

Thanks a lot again,
AG.

Thanks. I suspect this is a bit too far out of my area for me to be much help. I don't think I have ever used survey, but it gives me and other readers a bit of an idea of what you are doing. However, if I can refer you back to the reprex link above, we really should see what code you are using. You can just enclose the code between two lines with a ``` on each line to present the code.

I can read in your sample data, but it is better to present it in an R format. A handy way to supply sample data is to use the dput() function. See ?dput. For your sample dataset, dput(head(myfile, 20)) and then just copy and paste should work.

Ideally we should be able to run your code and, maybe, duplicate the error.
People are more likely to help if they do not have to fiddle with data.

If you have a very large dataset, then something like dput(head(myfile, 100)) will likely supply enough data for us to work with.
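
As a rough sketch of the dput() suggestion: the snippet below builds a tiny data frame from the first three rows of the AGE and WTS_S columns posted above, prints the text that would be pasted into a forum post, and checks that the text rebuilds the same rows. The names txt and rebuilt are only illustrative.

```r
# Tiny stand-in for the real data: first three rows of AGE and WTS_S from the post.
myfile <- data.frame(AGE = c(61, 71, 8), WTS_S = c(124, 259, 71))

# head() keeps the first rows; dput() prints R code that recreates them.
dput(head(myfile, 2))

# Round trip: capture the printed text, then parse and evaluate it,
# which is what a reader does when pasting it into their own session.
txt <- capture.output(dput(head(myfile, 2)))
rebuilt <- eval(parse(text = txt))

# all.equal() returns TRUE when the rebuilt rows match the originals.
same <- isTRUE(all.equal(rebuilt, head(myfile, 2)))
same
```

Note the order of the calls: head() goes inside dput(), so that only the first rows are written out.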

Hi again,
I already solved my problems.
Q1: When I use “combined.weights = T” I don’t get the warning message. Additionally, I compared the results with the output generated by SAS and both match. So, I feel comfortable that the right thing to do is to specify T.
Q2: I coded “repweights = "BSW[1-9]+"” and it worked.
Q3: I was able to get what I wanted as follows:
FreqProv = as.data.frame(svymean(~factor(PROVINCE), CCHSDesign))
Thank you very much,
AG.
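
For anyone landing here with the same Q2 symptom, here is a small base R sketch of why the original pattern picked up only some of the columns. It assumes the column names are exactly BSW1 to BSW500, as described in the thread, and that a character repweights is used as a regular expression against the column names. The anchored pattern at the end is my own suggestion, not something from the thread.

```r
# Column names as described in the thread: BSW1 ... BSW500.
bsw_names <- paste0("BSW", 1:500)

# "[1-500]" is a character class: the range 1-5 plus the character 0.
# It matches ONE character from {0, 1, 2, 3, 4, 5}, not the numbers 1 to 500,
# so only names whose first digit is 1 to 5 are picked up.
n_class <- sum(grepl("BSW[1-500]", bsw_names))
n_class   # 456, the replicate count reported in Q2

# Same effect on the 20-column subsample: "[1-20]" means one of {0, 1, 2}.
n_small <- sum(grepl("BSW[1-20]", paste0("BSW", 1:20)))
n_small   # 13, as reported for the subsample

# "BSW" followed by one or more digits matches every replicate column.
n_plus <- sum(grepl("BSW[1-9]+", bsw_names))
n_plus    # 500

# Anchored variant (an assumption, not from the thread): it also guards
# against unrelated columns that merely contain "BSW" plus a digit.
n_anchored <- sum(grepl("^BSW[0-9]+$", bsw_names))
n_anchored   # 500
```

The counts 456 and 13 line up with what was reported for the full data and for the 20-weight subsample, which suggests the pattern was the cause rather than anything in the data.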
