Subsetting two binary variables?

grantfel10 · December 7, 2020, 12:49am

Hi, I'm pretty new to R and I'm working on a project for my first semester in grad school. I'm using European Parliament voting percentage as the dependent variable, with indices and socioeconomic variables as the IVs. My research question seeks to explain political movement away from the center in Europe. To do so, I'd like to explain the EP voting percentage of populist and eurosceptic groups, which are coded 1 or 0 in a dataset.

There are 134 observations of populist parties over 3 elections and 173 of eurosceptic parties. I would like to subset the parties that are populist or eurosceptic. There is often overlap between the two, but I believe there are about 20 or so populist parties that are not eurosceptic, so I should have 193ish observations. But whenever I try to subset the data using:

EU_NEW <- subset(EU_NEW, POPULIST == 1 | EUROSCEPTIC == 1)

the new dataset only contains rows where both eurosceptic and populist = 1 (134 observations), not the two variables' observations combined. I'm not sure if I worded this question well, so please ask any questions that might help, but what I would like to do is keep the rows matching either variable so I have more observations for my DV.

Thanks!

AlexisW · December 7, 2020, 2:21am

I'm not sure how that could happen. The approach you use should work, as this example shows:

df <- data.frame(id=1:10,
                 POPULIST = c(rep(1,4), rep(0,6)),
                 EUROSCEPTIC = c(rep(0,2),rep(1,4), rep(0,4)))


df
#>    id POPULIST EUROSCEPTIC
#> 1   1        1           0
#> 2   2        1           0
#> 3   3        1           1
#> 4   4        1           1
#> 5   5        0           1
#> 6   6        0           1
#> 7   7        0           0
#> 8   8        0           0
#> 9   9        0           0
#> 10 10        0           0
subset(df, POPULIST == 1 | EUROSCEPTIC == 1)
#>   id POPULIST EUROSCEPTIC
#> 1  1        1           0
#> 2  2        1           0
#> 3  3        1           1
#> 4  4        1           1
#> 5  5        0           1
#> 6  6        0           1

Created on 2020-12-06 by the reprex package (v0.3.0)

So to understand what's wrong in your case, it would help if you made a minimal reproducible example. See some guidelines here:

FAQ: How to do a minimal reproducible example (reprex) for beginners (Guides & FAQs)

A minimal reproducible example consists of the following items:

A minimal dataset, necessary to reproduce the issue

The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages.

Let's quickly go over each one of these with examples:

Minimal Dataset (Sample Data)

You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame

head(iris)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.…
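For this thread, one way to share a minimal dataset could look like the sketch below. It uses the toy `df` from my example as a stand-in, since I don't have your `EU_NEW`; the object name `small` and the choice of 6 rows are just assumptions for illustration. `dput()` is a base R function that prints R code recreating an object, so the output can be pasted straight into a post.

```r
# Toy stand-in for the real data (EU_NEW is not available here)
df <- data.frame(id = 1:10,
                 POPULIST = c(rep(1, 4), rep(0, 6)),
                 EUROSCEPTIC = c(rep(0, 2), rep(1, 4), rep(0, 4)))

# Keep only the columns relevant to the problem and a handful of rows
small <- head(df[, c("POPULIST", "EUROSCEPTIC")], 6)

# Print R code that recreates `small`, ready to paste into a forum post
dput(small)
```

With your real data you would replace `df` with `EU_NEW` and ideally pick rows that include each combination of the two variables.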

PS: to get a better look at your data and see how many parties to expect, you could try cross-tabulating POPULIST vs EUROSCEPTIC like this:

table(df$POPULIST, df$EUROSCEPTIC)
#>   
#>    0 1
#>  0 4 2
#>  1 2 2
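To connect that table to the subset size: the rows kept by `POPULIST == 1 | EUROSCEPTIC == 1` should be every cell except the 0/0 one, which is the same as populist + eurosceptic - both. Here is a rough sketch of that check on the toy `df` from above (not your real data). The `useNA = "ifany"` argument and the `na.rm = TRUE` calls are there on the assumption that your real data might contain missing values; `subset()` drops rows where the condition evaluates to `NA`, so it is worth seeing whether any exist.

```r
# Toy data from the example above (not the real EU_NEW dataset)
df <- data.frame(id = 1:10,
                 POPULIST = c(rep(1, 4), rep(0, 6)),
                 EUROSCEPTIC = c(rep(0, 2), rep(1, 4), rep(0, 4)))

# Cross-tabulate, showing NA as its own level if any are present
tab <- table(POPULIST = df$POPULIST,
             EUROSCEPTIC = df$EUROSCEPTIC,
             useNA = "ifany")
print(tab)

# Inclusion-exclusion: n(A or B) = n(A) + n(B) - n(A and B)
n_pop    <- sum(df$POPULIST == 1, na.rm = TRUE)
n_euro   <- sum(df$EUROSCEPTIC == 1, na.rm = TRUE)
n_both   <- sum(df$POPULIST == 1 & df$EUROSCEPTIC == 1, na.rm = TRUE)
n_either <- n_pop + n_euro - n_both

# Number of rows subset() actually keeps
n_subset <- nrow(subset(df, POPULIST == 1 | EUROSCEPTIC == 1))

# Stops with an error if the two counts disagree
stopifnot(n_either == n_subset)
```

If the same check fails on your data, or if `n_both` comes out close to `n_pop`, that would suggest the overlap between the two groups is different from what you expect.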
