Anesrake error: "Error in targetvec - dat: non-numeric argument to binary operator"

mrlazerbeam · February 5, 2021, 12:47pm

First time posting, and a very new user of RStudio. I've made a reprex below of an error I keep getting.

I am trying to rake survey data to make it representative of the UK population. I have a set of responses and have found population distribution data to rake against. I am using the anesrake package, and my problem looks similar to this question: R anesrake error: "Error in x + weights: non-numeric argument for binary operator".

I have 4 columns of data with 30 rows: caseid, gender (M, F), age (18-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75) and Answer (1, 0). My data is saved on a shared drive, so I haven't included it in the reprex.

I have loaded target distributions for age and gender, but continue to get the following error: "Error in targetvec - dat: non-numeric argument to binary operator"

I have attempted to include a reprex below.

Any assistance would really help!

Thanks

Ed

library(readxl)
library(weights)
library(anesrake)
library(plyr)
library(dplyr)
library(reshape2)
library(reprex)
rtest <- read_excel(data is not shared)
View(rtest) 
# UK Census 2019
gender  <- c(.51,.49)
age <- c(0.1,.17,.163,0.166,0.161,0.128,0.112)
# definitions of target list
targets <- list(gender, age)
#important to use same variable names of the dataset
names(targets) <- c("gender", "age")
#label levels of targets#
names(targets$gender) <- levels(rtest$gender)
names(targets$age) <- levels(rtest$age)
# change table type
rtest <- as.data.frame(rtest)
class(rtest)
#> [1] "data.frame"
#measure discrepancy between population targets and sample
anesrakefinder(targets, rtest, choosemethod = "total")
#>    gender       age 
#> 0.1533333 0.4053333
#convert to factor (unsure if this works)
rtest$gender <- as.factor(rtest$gender) 
rtest$age <- as.factor(rtest$age)
#raking procedure
raking <- anesrake(targets, rtest, rtest$caseid, cap = 5,choosemethod = "total", type = "pctlim",pctlim = 0.05)
#> Error in targetvec - dat: non-numeric argument to binary operator

Created on 2021-02-05 by the reprex package (v1.0.0)
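
One thing that stands out in the reprex above: `levels()` returns `NULL` for a column that is not a factor, and `read_excel()` returns text columns as character, so the two `names(targets$...) <- levels(...)` lines run before the `as.factor()` conversion and leave the targets unnamed. Converting to factors first and naming the targets afterwards is worth trying. Below is a minimal base-R sketch of that ordering issue. The data frame `toy` and its values are made up for illustration (they are not the poster's data), and the sketch does not call anesrake itself, so it is not a confirmed fix for the error.

```r
# Hypothetical stand-in for the unshared survey data (character columns,
# as read_excel() would return them)
toy <- data.frame(
  caseid = 1:6,
  gender = c("M", "F", "F", "M", "F", "M"),
  age    = c("18-24", "25-34", "18-24", "35-44", "25-34", "18-24"),
  stringsAsFactors = FALSE
)

# On a character column levels() is NULL, so assigning it as names does nothing
gender_target <- c(0.51, 0.49)
names(gender_target) <- levels(toy$gender)
unnamed_before <- is.null(names(gender_target))

# Convert to factor first, then take the target names from the levels
toy$gender <- factor(toy$gender)
names(gender_target) <- levels(toy$gender)
gender_target
```

Note that factor levels are sorted alphabetically by default, so here 0.51 ends up attached to "F" and 0.49 to "M". It is worth checking that the order of the target values matches the order of the levels before raking.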

technocrat · February 7, 2021, 8:12am

No representative data means the issue has to be reverse engineered, which is a deterrent to helpful answers. It doesn't have to be all the data or even your data, just enough data in the same form to reproduce the issue.

See the FAQ: How to do a minimal reproducible example (reprex) for beginners

mrlazerbeam · February 8, 2021, 11:44am

Thanks for the guidance. I have now included sample data, and have been able to make some progress by recoding the age categories as 1-7 and gender as 1-2.

I now get NAs in the summary tables that are produced. Any thoughts?
I'd also like to avoid recoding categorical variables as numbers for this to work, but I'm not sure how.

library(readxl)
library(weights)
library(anesrake)
library(plyr)
library(dplyr)
library(reshape2)
library(reprex)
library(datapasta)
#Example data#
head(rtest2,10)
#> # A tibble: 10 x 4
#>    caseid gender   age Answer
#>     <dbl>  <dbl> <dbl>  <dbl>
#>  1      1      1     1      1
#>  2      2      2     2      0
#>  3      3      1     1      1
#>  4      4      1     1      1
#>  5      5      1     4      1
#>  6      6      1     1      0
#>  7      7      2     2      0
#>  8      8      2     1      1
#>  9      9      1     1      0
#> 10     10      1     4      0
datapasta::df_paste(head(rtest2,10))
#> data.frame(
#>       caseid = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
#>       gender = c(1, 2, 1, 1, 1, 1, 2, 2, 1, 1),
#>          age = c(1, 2, 1, 1, 4, 1, 2, 1, 1, 4),
#>       Answer = c(1, 0, 1, 1, 1, 0, 0, 1, 0, 0)
#> )
data.frame(
      caseid = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
      gender = c(1, 2, 1, 1, 1, 1, 2, 2, 1, 1),
         age = c(1, 2, 1, 1, 4, 1, 2, 1, 1, 4),
      Answer = c(1, 0, 1, 1, 1, 0, 0, 1, 0, 0)
)
#>    caseid gender age Answer
#> 1       1      1   1      1
#> 2       2      2   2      0
#> 3       3      1   1      1
#> 4       4      1   1      1
#> 5       5      1   4      1
#> 6       6      1   1      0
#> 7       7      2   2      0
#> 8       8      2   1      1
#> 9       9      1   1      0
#> 10     10      1   4      0
#Proportions of rtest2 survey#
wpct(rtest2$gender)
#>    1    2 
#> 0.65 0.35
wpct(rtest2$age)
#>     1     2     3     4     5     6     7 
#> 0.274 0.246 0.114 0.096 0.076 0.102 0.092
wpct(rtest2$Answer)
#>     0     1 
#> 0.516 0.484
# UK Census 2019
gender  <- c(.51,.49)
age <- c(0.1,.17,.163,0.166,0.161,0.128,0.112)
# definitions of target list
targets <- list(gender, age)
#important to use same variable names of the dataset
names(targets) <- c("gender", "age")
#id variable
rtest2$caseid <- 1:length(rtest2$gender)
#label levels of targets#
names(targets$gender) <- levels(rtest2$gender)
names(targets$age) <- levels(rtest2$age)
# change table type
rtest2 <- as.data.frame(rtest2)
class(rtest2)
#> [1] "data.frame"
#measure discrepancy between population targets and sample
anesrakefinder(targets, rtest2, choosemethod = "total")
#> gender    age 
#>   0.28   0.50
#raking procedure
raking <- anesrake(targets, rtest2, caseid = rtest2$caseid, verbose = FALSE, cap = 5,choosemethod = "total", type = "pctlim",pctlim = 0.05, nlim = 5, iterate = TRUE, force1 = TRUE)
#> [1] "Raking converged in 11 iterations"
raking_summary <- summary(raking)
rtest2$weight <- raking$weightvec
#to find the unique weights#
rtest2 %>% select(gender, age) %>% unique()
#>    gender age
#> 1       1   1
#> 2       2   2
#> 5       1   4
#> 8       2   1
#> 12      1   2
#> 14      2   6
#> 15      1   7
#> 16      2   7
#> 17      1   3
#> 22      1   6
#> 35      1   5
#> 39      2   5
#> 40      2   3
#> 62      2   4
wpct(rtest2$Answer)
#>     0     1 
#> 0.516 0.484
wpct(rtest2$Answer, rtest2$weight)
#>         0         1 
#> 0.5120287 0.4879713
#Raking summary statistics do not work "NA"#
raking_summary$raking.variables
#> [1] "gender" "age"
raking_summary$gender
#>       Target Unweighted N Unweighted % Wtd N Wtd % Change in % Resid. Disc.
#> <NA>    0.51           NA           NA    NA    NA          NA           NA
#> <NA>    0.49           NA           NA    NA    NA          NA           NA
#> Total   1.00            0            0     0     0           0            0
#>       Orig. Disc.
#> <NA>           NA
#> <NA>           NA
#> Total           0
raking_summary$age
#>       Target Unweighted N Unweighted % Wtd N Wtd % Change in % Resid. Disc.
#> <NA>   0.100           NA           NA    NA    NA          NA           NA
#> <NA>   0.170           NA           NA    NA    NA          NA           NA
#> <NA>   0.163           NA           NA    NA    NA          NA           NA
#> <NA>   0.166           NA           NA    NA    NA          NA           NA
#> <NA>   0.161           NA           NA    NA    NA          NA           NA
#> <NA>   0.128           NA           NA    NA    NA          NA           NA
#> <NA>   0.112           NA           NA    NA    NA          NA           NA
#> Total  1.000            0            0     0     0           0            0
#>       Orig. Disc.
#> <NA>           NA
#> <NA>           NA
#> <NA>           NA
#> <NA>           NA
#> <NA>           NA
#> <NA>           NA
#> <NA>           NA
#> Total           0

Created on 2021-02-08 by the reprex package (v1.0.0)
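
The `<NA>` row labels in the summary tables are consistent with the targets having no names: `rtest2$gender` and `rtest2$age` are numeric (`<dbl>`), so `levels()` returns `NULL` for both and the `names(targets$...)` lines have no effect. One way to keep readable categories instead of numeric codes is to turn the codes into labelled factors and name the targets with the same labels. The sketch below assumes that gender code 1 is "M" and 2 is "F", and that age codes 1 to 7 follow the order of the age bands listed in the first post. Those mappings are guesses, the data frame `toy2` is hypothetical, and whether this fully fixes the anesrake summary has not been verified here.

```r
# Hypothetical numeric-coded data in the same shape as rtest2
toy2 <- data.frame(
  caseid = 1:8,
  gender = c(1, 2, 1, 1, 2, 2, 1, 2),
  age    = c(1, 2, 3, 4, 5, 6, 7, 1)
)

# Assumed code-to-label mappings (not confirmed by the poster)
gender_labels <- c("M", "F")
age_labels <- c("18-24", "25-34", "35-44", "45-54", "55-64", "65-74", "75")

# Turn the numeric codes into labelled factors
toy2$gender <- factor(toy2$gender, levels = 1:2, labels = gender_labels)
toy2$age    <- factor(toy2$age, levels = 1:7, labels = age_labels)

# Targets from the thread, named in the same order as the factor levels
targets <- list(
  gender = setNames(c(0.51, 0.49), levels(toy2$gender)),
  age    = setNames(c(0.1, 0.17, 0.163, 0.166, 0.161, 0.128, 0.112),
                    levels(toy2$age))
)

# Sanity checks before raking: target names match the factor levels,
# and each target vector sums to 1
names_ok <- all(mapply(function(t, v) identical(names(t), levels(toy2[[v]])),
                       targets, names(targets)))
sums_ok <- all(abs(sapply(targets, sum) - 1) < 1e-8)
```

If both checks are `TRUE`, the named `targets` list could then be passed to `anesrake()` in place of the unnamed one, with the data frame holding factor columns rather than numeric codes.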

technocrat · February 12, 2021, 7:56am

Sorry for the delay. I'm going to have to study this package. DM me if you don't hear back in a reasonable time.
