Seeking help for using filter and mutate command to break down a current variable into separate variables

Hello,
My name is Allen and I am currently working on a project in RStudio. The data set I am working with has a variable called Abbreviated Stae name. This variable identifies the state for each observation and lets me analyze the data on a state-by-state basis. I am trying to use the mutate command, along with filter, to break this state variable down into separate variables that group the states into four regions: North, West, Midwest, and South. My code is below. It returned an error.

politics <- politics %>%
  group_by(`repordem`, `chamber`) %>%
  select(`Party Affiliation`, chamber, `Ideology Based on Roll Call Votes`, repordem, chambermean, extreme, Year, `Abbreviated Stae name`, `% high sch`, `% age 65`, `% black`, `% farmer`,`% on finance`, `% gvtwrkr`,`% manuf`,`% urban`,`% veterans`,`% unemployed`,`% latino`)
politics <- mutate(politics, 
            North =((`Abbreviated Stae name` == "CT")|(`Abbreviated Stae name` == "ME")|(`Abbreviated Stae name` == "MA")|(`Abbreviated Stae name` == "NH")|(`Abbreviated Stae name` == "RI")|(`Abbreviated Stae name` == "VT")|(`Abbreviated Stae name` == "NJ")|(`Abbreviated Stae name` == "NY")|(`Abbreviated Stae name` == "PA"))
            summary(politics$North)
politics <- politics %>%
  group_by(`repordem`, `chamber`) %>%
  select(`Party Affiliation`, chamber, `Ideology Based on Roll Call Votes`, repordem, chambermean, extreme, Year, `Abbreviated Stae name`, `% high sch`, `% age 65`, `% black`, `% farmer`,`% on finance`, `% gvtwrkr`,`% manuf`,`% urban`,`% veterans`,`% unemployed`,`% latino`)
politics <- mutate(politics, 
            Midwest =((`Abbreviated Stae name` == "IL")|(`Abbreviated Stae name` == "IN")|(`Abbreviated Stae name` == "MI")|(`Abbreviated Stae name` == "OH")|(`Abbreviated Stae name` == "WI")|(`Abbreviated Stae name` == "IA")|(`Abbreviated Stae name` == "KS")|(`Abbreviated Stae name` == "MN")|(`Abbreviated Stae name` == "MS")|(`Abbreviated Stae name` == "NE")|(`Abbreviated Stae name` == "ND")|(`Abbreviated Stae name` == "SD"))
politics <- politics %>%
  group_by(`repordem`, `chamber`) %>%
  select(`Party Affiliation`, chamber, `Ideology Based on Roll Call Votes`, repordem, chambermean, extreme, Year, `Abbreviated Stae name`, `% high sch`, `% age 65`, `% black`, `% farmer`,`% on finance`, `% gvtwrkr`,`% manuf`,`% urban`,`% veterans`,`% unemployed`,`% latino`)
politics <- mutate(politics, 
           South=((`Abbreviated Stae name` == "DE")|(`Abbreviated Stae name` == "FL")|(`Abbreviated Stae name` == "GA")|(`Abbreviated Stae name` == "MD")|(`Abbreviated Stae name` == "NC")|(`Abbreviated Stae name` == "SC")|(`Abbreviated Stae name` == "VA")|(`Abbreviated Stae name` == "WV")|(`Abbreviated Stae name` == "AL")|(`Abbreviated Stae name` == "KY")|(`Abbreviated Stae name` == "MI")|(`Abbreviated Stae name` == "TN")|(`Abbreviated Stae name` == "AK")|(`Abbreviated Stae name` == "LA")|(`Abbreviated Stae name` == "OK")|(`Abbreviated Stae name` == "TX"))
politics <- politics %>%
  group_by(`repordem`, `chamber`) %>%
  select(`Party Affiliation`, chamber, `Ideology Based on Roll Call Votes`, repordem, chambermean, extreme, Year, `Abbreviated Stae name`, `% high sch`, `% age 65`, `% black`, `% farmer`,`% on finance`, `% gvtwrkr`,`% manuf`,`% urban`,`% veterans`,`% unemployed`,`% latino`) 
politics <- mutate(politics, 
        West=((`Abbreviated Stae name` == "CA")|(`Abbreviated Stae name` == "AZ")|(`Abbreviated Stae name` == "CO")|(`Abbreviated Stae name` == "ID")|(`Abbreviated Stae name` == "MT")|(`Abbreviated Stae name` == "NV")|(`Abbreviated Stae name` == "NM")|(`Abbreviated Stae name` == "UT")|(`Abbreviated Stae name` == "WY")|(`Abbreviated Stae name` == "AK")|(`Abbreviated Stae name` == "HI")|(`Abbreviated Stae name` == "OR")|(`Abbreviated Stae name` == "WA"))

The same error appeared for all of these. The error was:

Error: Incomplete expression: politics <- politics %>%
  group_by(`repordem`, `chamber`) %>%
  select(`Party Affiliation`, chamber, `Ideology Based on Roll Call Votes`, repordem, chambermean, extreme, Year, `Abbreviated Stae name`, `-30520gh sch`, ` 0x0.00000022f96p-1022ge 65`, `% black`, ` 0.000000armer`,`3553363500n finance`, ` 6.95305e-310vtwrkr`,`Successanuf`,`65536rban`,`% veterans`,`497937051nemployed`,` 0x0.07ffe915202cp-1022tino`) %>%
politics <- mutate(politics, 
        West=((`Abbreviated Stae name` == "CA")|(`Abbreviated Stae name` == "AZ")|(`Abbreviated Stae name` == "CO")|(`Abbreviated Stae name` == "ID")|(`Abbreviated Stae name` == "MT")|(`Abbreviated Stae name` == "NV")|(`Abbreviated Stae name` == "NM")|(`Abbreviated Stae name` == "UT")|(`Abbreviated Stae name` == "WY")|(`Abbreviated Stae name` == "AK")|(`Abbreviated Stae name` == "HI")|(`Abbreviated Stae name` == "OR")|(`Abbreviated Stae name` == "WA"))

Please help me with this! I am looking to use these new variables to further analyze my data set, which concerns political polarization. Thank you! I greatly appreciate it.
Sincerely,
Allen P.

Hi Allen, the first step I would suggest is to turn your problem into a reprex (a minimal reproducible example).
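
One common way to share a small piece of data in a reproducible example is base R's dput(), which prints R code that rebuilds an object. The sketch below uses a made-up data frame (sample_df and its columns are placeholders, not the real politics data) and checks that the printed code recreates it.

```r
# Hypothetical stand-in for a few rows of the real data
sample_df <- data.frame(
  state   = c("CT", "TX", "CA"),
  chamber = c("House", "Senate", "House"),
  stringsAsFactors = FALSE
)

# dput() prints code that recreates the object; capture it as text
code <- capture.output(dput(sample_df))

# Anyone can paste that text into their session to rebuild the data
rebuilt <- eval(parse(text = paste(code, collapse = "\n")))
```

In practice you would run something like dput(head(your_data, 10)) and paste the printed output into your post.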

I would also suggest not using group_by before you use mutate to create new columns.

Also, take a look at the dplyr::case_when function. It will probably be easier than what you are doing now with mutate.
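
A minimal sketch of the case_when idea, assuming a toy tibble with a hypothetical state column standing in for `Abbreviated Stae name`. The state lists are deliberately shortened for illustration and are not a complete regional classification. It builds one region column rather than four separate logical columns.

```r
library(dplyr)

# Toy data; `state` is a placeholder for `Abbreviated Stae name`
toy <- tibble(state = c("CT", "IL", "GA", "CA", "ZZ"))

# Illustrative, intentionally incomplete lookups
north   <- c("CT", "ME", "MA")
midwest <- c("IL", "IN", "OH")
south   <- c("GA", "FL", "TX")
west    <- c("CA", "OR", "WA")

toy <- toy %>%
  mutate(region = case_when(
    state %in% north   ~ "North",
    state %in% midwest ~ "Midwest",
    state %in% south   ~ "South",
    state %in% west    ~ "West",
    TRUE               ~ NA_character_  # anything unmatched becomes NA
  ))
```

A single region column can then be passed to group_by or filter directly, which may be simpler than juggling four TRUE/FALSE columns.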

Finally, you should try to isolate your error to as small a subset of variables as possible. For example, right now you use select to choose multiple columns and then you use mutate to create a column with multiple states. Will this error appear when you use just, e.g., 2 columns in select and only one state in mutate? If yes, then it'll be easier to troubleshoot what is going wrong.
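
As a sketch of what such a stripped-down test could look like, here is a version with two columns and one state. The tibble small and its values are invented for illustration; only the column names are taken from the question.

```r
library(dplyr)

# Hypothetical two-column stand-in for the real data
small <- tibble(
  `Abbreviated Stae name` = c("CT", "TX", "CA"),
  chamber = c("House", "Senate", "House")
)

# One select, one mutate, one state; note every "(" has a matching ")"
small <- small %>%
  select(`Abbreviated Stae name`, chamber) %>%
  mutate(North = `Abbreviated Stae name` == "CT")

summary(small$North)
```

If this runs cleanly, add columns and states back a few at a time until the error returns.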

And fourthly, use %in% instead of a chain of == expressions.
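
To illustrate the difference, here is a small base R comparison on a made-up vector (state and north_states are placeholder names). One thing to be aware of: the two forms treat missing values differently.

```r
state <- c("CT", "NY", "TX", NA)

# Chain of == comparisons joined with |
chained <- (state == "CT") | (state == "NY") | (state == "PA")

# The same membership test written with %in%
north_states <- c("CT", "NY", "PA")
compact <- state %in% north_states

chained  # TRUE TRUE FALSE NA
compact  # TRUE TRUE FALSE FALSE
```

The %in% version is shorter, easier to check for typos in the state codes, and returns FALSE rather than NA for missing values.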

I second @mishabalyasin's suggestion to create a minimal reproducible example, and his other suggestions. That will be very helpful.

Also note that the error message Error: Incomplete expression means that the code ended before the expression was complete, typically because of an unclosed parenthesis, bracket, or quote.
If you were to run sum(1:10 (note the missing parenthesis) from a script you'd get this error.

I'm pretty sure you're missing a parenthesis closing your call to mutate.
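
One way to check this without running anything is to ask base R's parse() whether a piece of code is syntactically complete. The helper parses_ok below is a hypothetical name written for this sketch, and the code strings are simplified stand-ins for the mutate call in the question (df and state are placeholders). Because parse() does not evaluate the code, those objects do not need to exist.

```r
# TRUE if the text parses as R code, FALSE on any syntax error
parses_ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

parses_ok("sum(1:10")   # FALSE
parses_ok("sum(1:10)")  # TRUE

# Same pattern as the question: the opening "(" of mutate is never closed
broken <- 'x <- mutate(df, North = ((state == "CT") | (state == "NY"))'
fixed  <- paste0(broken, ")")

parses_ok(broken)  # FALSE
parses_ok(fixed)   # TRUE
```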
