# R dplyr mutate with condition

Hi, experts! This is my first post so my apologies for a potentially naive question.
I have a variable (var) and I want to create a new factor indicating which quintile it belongs to by comparing its value with a set of quintile threasholds of a different variable (var2). The first step I did was to var a quintile rank based on var2's threasholds. Then how do I pass var2's threasholds to a new factor quintile? I tried mutate but I'm new to it. Not sure how to create a new variable with mutate based on conditions?

w <- ifelse((RIpr\$var>=RIpr\$var2q0) & (RIpr\$var<=RIpr\$var2q20), 1,
ifelse((RIpr\$var>=RIpr\$var2q20u) & (RIpr\$var<=RIpr\$var2q40), 2,
ifelse((RIpr\$var>=RIpr\$var240u) & (RIpr\$var<=RIpr\$var2q60), 3,
ifelse((RIpr\$var>=RIpr\$var2q60u) & (RIpr\$var<=RIpr\$var2q80), 4,5) )))

newdata<-data.frame(RIpr,w)

new<-newdata%>%
filter(w=="1")%>%
mutate(newquint = paste0("[", as.numeric(RIpr\$var2q0),as.numeric(RIpr\$var2q20),"]", sep=","))%>%
filter(w=="2")%>%
mutate(newquint = paste0("[", as.numeric(RIpr\$var2q20u),as.numeric(RIpr\$var2q40),"]", sep=","))%>%
filter(w=="3")%>%
mutate(newquint = paste0("[", as.numeric(RIpr\$var2q40u),as.numeric(RIpr\$var2q60),"]", sep=","))%>%
filter(w=="4")%>%
mutate(newquint = paste0("[", as.numeric(RIpr\$var2q60u),as.numeric(RIpr\$var2q80),"]", sep=","))%>%
filter(w=="5") %>%
mutate(newquint = paste0("[", as.numeric(RIpr\$var2q80u),as.numeric(RIpr\$var2q100),"]", sep=","))

It would help if you could provide sample data that would allow us to test out code tailored to your problem. For now, based on your description, it looks like you could do something equivalent to the code below (where I've used the built-in mtcars data frame for illustration). We categorize disp into quintilies, where the quintiles are constructed from hp. This can be done wiith the cut function and we use the quantile function to construct breaks representing the quintiles.

For future reference, for numeric variables, the cut function is generally much easier than nested ifelse statements. Also, in the code where you create new there are two other issues: First, filter removes rows that don't meet the condition, so, for example, filter(w=="1") removes all rows for which w does not equal "1". Then filter(w=="2") removes all of the remaining rows, since only rows with w=="1" are fed to that filter. Second, each mutate overwrites the previous version of newquint, so even if there were some data being fed into each mutate statement, the values in newquint would only be the output of the final mutate. To learn more about how these and other dplyr functions work, see chapter 5 of the free book R for Data Science.

library(tidyverse)

mtcars %>%
select(disp, hp) %>%
mutate(disp.by.hp.quintile = cut(disp, breaks=quantile(hp, prob=seq(0,1,0.2)), include.lowest=TRUE))
disp  hp disp.by.hp.quintile
Mazda RX4           160.0 110           (110,165]
Mazda RX4 Wag       160.0 110           (110,165]
Datsun 710          108.0  93          (93.4,110]
Hornet 4 Drive      258.0 110           (200,335]
Valiant             225.0 105           (200,335]
Duster 360          360.0 245                <NA>
Merc 240D           146.7  62           (110,165]
Merc 230            140.8  95           (110,165]
Merc 280            167.6 123           (165,200]
Merc 280C           167.6 123           (165,200]
Merc 450SE          275.8 180           (200,335]
Merc 450SL          275.8 180           (200,335]
Merc 450SLC         275.8 180           (200,335]
Lincoln Continental 460.0 215                <NA>
Chrysler Imperial   440.0 230                <NA>
Fiat 128             78.7  66           [52,93.4]
Honda Civic          75.7  52           [52,93.4]
Toyota Corolla       71.1  65           [52,93.4]
Toyota Corona       120.1  97           (110,165]
Dodge Challenger    318.0 150           (200,335]
AMC Javelin         304.0 150           (200,335]
Camaro Z28          350.0 245                <NA>
Pontiac Firebird    400.0 175                <NA>
Fiat X1-9            79.0  66           [52,93.4]
Porsche 914-2       120.3  91           (110,165]
Lotus Europa         95.1 113          (93.4,110]
Ford Pantera L      351.0 264                <NA>
Ferrari Dino        145.0 175           (110,165]
Maserati Bora       301.0 335           (200,335]
Volvo 142E          121.0 109           (110,165]

Thank you very much for your detailed explanation and illustration! That's very helpful! I appreciated your help!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.