srvyr package: creating complex survey objects in R

I'm trying to use the srvyr package in R. The data comes from these links:

Expenses: https://cdn.bancentral.gov.do/documents/estadisticas/encuesta-de-gastos-e-ingresos/documents/Cuadros_Gastos.xlsx?v=1689283267553

Income: https://cdn.bancentral.gov.do/documents/estadisticas/encuesta-de-gastos-e-ingresos/documents/Cuadros_Ingresos.xlsx?v=1689283267553

Since this is a household survey, I use merge to combine these two datasets.

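library(readxl); library(dplyr); library(srvyr)  # read_excel(), %>% and select(), as_survey_design()

Ingresos <- read_excel("Sociodemograficas_e_Ingresos.xlsx", sheet = "Base")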

gastos <- read_excel("Gasto_consumo_final_mensual.xlsx")

The expansion factor is supposed to give the population estimate, which for the Dominican Republic is over 10,000,000.

If I sum the FACTOR_EXPANSION variable, I get exactly the expected amount. However, when I create my survey object, I don't get the population estimate.

Ingresos_Filtrado <- Ingresos %>%
  select(A204, A303, A302, A402, A404, A405, A410, GRUPO_RAMA, GRUPO_OCUPACION,
         GRUPO_CATEGORIA, GRUPO_EDAD, GRUPO_EDUCACION, GRUPO_SECTOR, GRUPO_EMPLEO, ESCOLARIDAD,
         SALARIO_PRINCIPAL, A201, A202A, A202B, A202C, A202D, A207, A208, A212, A221, GRUPO_REGION,
         DES_PROVINCIA, DES_MUNICIPIO, A206, A213, A218, A219, A224, A303, A309, CALLES_ASFALTADAS, ESTRATO,
         ALUMBRADO_PUBLICO, FACTOR_EXPANSION, VIVIENDA, HOGAR, MIEMBRO, UPM, PET, PEA, QUINTIL, TRIMESTRE, REPLICA, ORDEN_REGION, A102, A401, A401A)


Union_Ingresos_Gastos <- merge(
  x = Ingresos_Filtrado, y = gastos,
  by = c("TRIMESTRE", "REPLICA", "UPM", "FACTOR_EXPANSION", "VIVIENDA", "HOGAR", "ORDEN_REGION", "QUINTIL"),
  all.x = TRUE)

survey_2 <- Union_Ingresos_Gastos %>%
  as_survey_design(ids = UPM, strata = ESTRATO, weights = FACTOR_EXPANSION, nest = TRUE)

survey_2 %>%
  group_by(QUINTIL) %>%
  summarise(total = survey_total(A401A, level = 0.95, na.rm = TRUE)) %>%
  mutate(Total = sum(total))

The result is:

# A tibble: 5 × 4
  QUINTIL   total total_se   Total
    <dbl>   <dbl>    <dbl>   <dbl>
1       1 2213294   82873. 8513968
2       2 2127726   80353. 8513968
3       3 1765902   65479. 8513968
4       4 1483914   71456. 8513968
5       5  923132   54371. 8513968

With this code, the population estimate comes out as 8,513,968.

I need help: when I compute the total directly, without the survey object, I get the expected figure.

sum(Ingresos$FACTOR_EXPANSION)
[1] 10,299,551
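
A minimal sketch, on made-up data, of one thing worth checking here: survey_total(A401A, na.rm = TRUE) estimates the weighted total of A401A (dropping rows where it is NA), which is not the same quantity as the weighted number of people. The toy tibble, its values, and the helper column `one` are assumptions for illustration only; only the column names UPM, ESTRATO, FACTOR_EXPANSION and A401A come from the post.

```r
library(dplyr)
library(srvyr)

# Hypothetical toy data: 8 people, 2 PSUs in each of 2 strata, weight 100 each.
toy <- tibble(
  UPM              = c(1, 1, 2, 2, 3, 3, 4, 4),
  ESTRATO          = c(1, 1, 1, 1, 2, 2, 2, 2),
  FACTOR_EXPANSION = rep(100, 8),
  A401A            = c(1, 1, NA, 1, 0, 1, NA, 1),  # made-up values with gaps
  one              = 1                             # helper: counts every person
)

design <- toy %>%
  as_survey_design(ids = UPM, strata = ESTRATO,
                   weights = FACTOR_EXPANSION, nest = TRUE)

# Weighted head count: the total of a constant 1 is the sum of the weights.
pop <- design %>% summarise(n = survey_total(one))

# Weighted total of A401A: rows with NA are dropped and zeros add nothing.
tot <- design %>% summarise(total = survey_total(A401A, na.rm = TRUE))

print(pop)  # n is 800, the same as sum(toy$FACTOR_EXPANSION)
print(tot)  # total is 500, smaller than the population count
```

If A401A has missing values, or is not an indicator that equals 1 for everyone, its estimated total need not match sum(FACTOR_EXPANSION) even when the weights are applied correctly.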

Is the problem merging the two datasets?
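
A quick way to test the merge question, sketched on made-up data: compare the row count and the sum of the weights before and after merge(). The two toy data frames and the `gasto` column are hypothetical; the point is only that when the right-hand table has more than one row per key, merge() repeats the left-hand rows and the weights with them.

```r
# Hypothetical person-level table: household 1 has two members, household 2 has one.
ingresos_toy <- data.frame(
  HOGAR            = c(1, 1, 2),
  MIEMBRO          = c(1, 2, 1),
  FACTOR_EXPANSION = c(100, 100, 150)
)

# Hypothetical expense table with TWO rows for household 1 (e.g. one per item).
gastos_toy <- data.frame(
  HOGAR            = c(1, 1, 2),
  FACTOR_EXPANSION = c(100, 100, 150),
  gasto            = c(10, 20, 30)
)

merged <- merge(ingresos_toy, gastos_toy,
                by = c("HOGAR", "FACTOR_EXPANSION"), all.x = TRUE)

# Does the key identify a single row in the right-hand table?
dup_keys <- anyDuplicated(gastos_toy[, c("HOGAR", "FACTOR_EXPANSION")]) > 0

nrow(ingresos_toy)                  # 3 rows before the merge
nrow(merged)                        # 5 rows after: household 1 is repeated
sum(ingresos_toy$FACTOR_EXPANSION)  # 350
sum(merged$FACTOR_EXPANSION)        # 550, inflated by the repeated rows
```

Running the same comparison on Ingresos_Filtrado and Union_Ingresos_Gastos would show whether the merge changed the number of rows or the sum of FACTOR_EXPANSION.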

Perhaps I need another argument in as_survey_design()?

Help!

After some research, I believe the issue is with fpc (the finite population correction).

However, I don't know how to calculate it; most household surveys would already include this variable.
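
For what it is worth, a finite population correction changes the standard errors, not the point estimate of a total, so it would not by itself explain a gap in the estimated population. A minimal sketch on made-up data, assuming a hypothetical column N_UPM that holds the number of PSUs existing in each stratum (no such column is mentioned in the post):

```r
library(dplyr)
library(srvyr)

# Hypothetical toy data: 2 sampled PSUs out of 10 in each of 2 strata.
toy <- tibble(
  UPM              = c(1, 1, 2, 2, 3, 3, 4, 4),
  ESTRATO          = c(1, 1, 1, 1, 2, 2, 2, 2),
  FACTOR_EXPANSION = rep(5, 8),   # 10 PSUs / 2 sampled
  N_UPM            = rep(10, 8),  # assumed PSU count per stratum
  y                = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Same weights, no fpc.
no_fpc <- toy %>%
  as_survey_design(ids = UPM, strata = ESTRATO,
                   weights = FACTOR_EXPANSION, nest = TRUE) %>%
  summarise(total = survey_total(y))

# Same weights, with fpc.
with_fpc <- toy %>%
  as_survey_design(ids = UPM, strata = ESTRATO,
                   weights = FACTOR_EXPANSION, fpc = N_UPM, nest = TRUE) %>%
  summarise(total = survey_total(y))

print(no_fpc)    # same total as below
print(with_fpc)  # same total, smaller total_se
```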
