The full code is provided at the bottom.
Suppose we have a wide dataset of five students and their scores in Maths, Biology, English and French.
# Load dplyr ----
library(dplyr)
# Create a sample dataset
set.seed(123)
tbl <- tibble(
  student = LETTERS[1:5],
  maths = sample(50:80, size = 5),
  biology = sample(70:100, size = 5),
  english = sample(70:90, size = 5),
  french = sample(70:100, size = 5)
)
tbl
# A tibble: 5 × 5
  student maths biology english french
  <chr>   <int>   <int>   <int>  <int>
1 A          80      79      89     77
2 B          64      87      83     95
3 C          68      91      74     76
4 D          63      80      78     79
5 E          52      74      72     78
The school is offering scholarships to students who have performed well in particular groups of subjects.
- A language scholarship is awarded to those with an 80 or above in English AND French
- First, we use the filter() function to keep only the rows that meet the condition.
- Second, we use the if_all() function because the condition must be met across all relevant columns, namely English AND French.
- This is why we specify .cols = c("english", "french") as the first argument of if_all().
- Finally, we write an anonymous function, .fns = ~ .x >= 80, which expresses the condition for getting the scholarship.
tbl %>%
  filter(
    if_all(.cols = c("english", "french"),
           .fns = ~ .x >= 80)
  )
# A tibble: 1 × 5
  student maths biology english french
  <chr>   <int>   <int>   <int>  <int>
1 B          64      87      83     95
As you can see, only Student B is eligible for the language scholarship, as they are the only one who scored 80 or above in both English and French.
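To make it clearer what if_all() is doing, here is a minimal sketch that compares it with the hand-written condition joined by &. The scores are hard-coded from the table printed above so the result does not depend on sample(); the names scores, with_if_all and with_and are only illustrative.

```r
library(dplyr)

# Scores copied from the printed table (assumption: that output is accurate)
scores <- tibble(
  student = LETTERS[1:5],
  maths   = c(80L, 64L, 68L, 63L, 52L),
  biology = c(79L, 87L, 91L, 80L, 74L),
  english = c(89L, 83L, 74L, 78L, 72L),
  french  = c(77L, 95L, 76L, 79L, 78L)
)

# if_all() version, as used in the tutorial
with_if_all <- scores %>%
  filter(if_all(.cols = c("english", "french"), .fns = ~ .x >= 80))

# Hand-written version: one comparison per column, joined with &
with_and <- scores %>%
  filter(english >= 80 & french >= 80)

# Both approaches keep the same students
identical(with_if_all$student, with_and$student)
```

With two columns the hand-written version is just as short, but if_all() avoids repeating the condition as the number of columns grows.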
- A science scholarship is awarded to those with an 80 or above in Maths AND Biology
tbl %>%
  filter(
    if_all(.cols = c("maths", "biology"),
           .fns = ~ .x >= 80)
  )
# A tibble: 0 × 5
# … with 5 variables: student <chr>, maths <int>, biology <int>, english <int>, french <int>
Unfortunately, no student qualifies for the science scholarship.
- A lower quality language scholarship is awarded to those with an 80 or above in English OR French
The main difference here is that the condition of scoring 80 or above no longer has to be met for both English and French. It only has to be met for AT LEAST ONE of them. In such a case, the if_any() function will do the job.
tbl %>%
  filter(
    if_any(.cols = c("english", "french"),
           .fns = ~ .x >= 80)
  )
# A tibble: 2 × 5
  student maths biology english french
  <chr>   <int>   <int>   <int>  <int>
1 A          80      79      89     77
2 B          64      87      83     95
Students A and B qualify for the lower quality language scholarship (we already saw that Student B gets the main scholarship anyway).
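The difference between the two functions can also be seen row by row. The sketch below assumes that if_any() and if_all() can be used inside mutate() as well as filter() (they return one logical value per row), and again hard-codes the scores from the printed table. The column names english_ok, french_ok, any_ok and all_ok are made up for this illustration.

```r
library(dplyr)

# Scores copied from the printed table (assumption: that output is accurate)
scores <- tibble(
  student = LETTERS[1:5],
  maths   = c(80L, 64L, 68L, 63L, 52L),
  biology = c(79L, 87L, 91L, 80L, 74L),
  english = c(89L, 83L, 74L, 78L, 72L),
  french  = c(77L, 95L, 76L, 79L, 78L)
)

flags <- scores %>%
  mutate(
    english_ok = english >= 80,
    french_ok  = french >= 80,
    # TRUE when at least one of the two columns meets the condition
    any_ok = if_any(c(english, french), ~ .x >= 80),
    # TRUE only when both columns meet the condition
    all_ok = if_all(c(english, french), ~ .x >= 80)
  ) %>%
  select(student, english_ok, french_ok, any_ok, all_ok)

flags
```

Student B passes in both subjects and still has any_ok equal to TRUE, which is why if_any() should be read as "at least one" rather than "exactly one".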
- A lower quality science scholarship is awarded to those with an 80 or above in Maths OR Biology
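tbl %>%
  filter(
    if_any(.cols = c("maths", "biology"),
           .fns = ~ .x >= 80)
  )
# A tibble: 4 × 5
  student maths biology english french
  <chr>   <int>   <int>   <int>  <int>
1 A          80      79      89     77
2 B          64      87      83     95
3 C          68      91      74     76
4 D          63      80      78     79
Students A, B, C and D qualify for the lower quality science scholarship: Student A through Maths, and Students B, C and D through Biology.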
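The four queries above follow the same pattern, so they could be wrapped in one small function. This is only a sketch: eligible(), threshold and require_all are hypothetical names that do not appear in the tutorial, and the scores are hard-coded from the printed table. Because the subject names arrive as a character vector stored in a variable, the sketch wraps them in all_of().

```r
library(dplyr)

# Scores copied from the printed table (assumption: that output is accurate)
scores <- tibble(
  student = LETTERS[1:5],
  maths   = c(80L, 64L, 68L, 63L, 52L),
  biology = c(79L, 87L, 91L, 80L, 74L),
  english = c(89L, 83L, 74L, 78L, 72L),
  french  = c(77L, 95L, 76L, 79L, 78L)
)

# Hypothetical helper: keep students scoring at least `threshold`
# in all of `subjects` (require_all = TRUE) or in any of them (FALSE).
eligible <- function(data, subjects, threshold = 80, require_all = TRUE) {
  if (require_all) {
    filter(data, if_all(all_of(subjects), ~ .x >= threshold))
  } else {
    filter(data, if_any(all_of(subjects), ~ .x >= threshold))
  }
}

# Main and lower quality language scholarships
eligible(scores, c("english", "french"))$student
eligible(scores, c("english", "french"), require_all = FALSE)$student

# Main and lower quality science scholarships
eligible(scores, c("maths", "biology"))$student
eligible(scores, c("maths", "biology"), require_all = FALSE)$student
```

Keeping the threshold as an argument makes it easy to try other cut-offs without rewriting the filter.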
I hope this short tutorial helps you understand how to use if_any() and if_all().
Cheers
Here is the full code:
# Load dplyr ----
library(dplyr)
# Create a sample dataset
set.seed(123)
tbl <- tibble(
  student = LETTERS[1:5],
  maths = sample(50:80, size = 5),
  biology = sample(70:100, size = 5),
  english = sample(70:90, size = 5),
  french = sample(70:100, size = 5)
)
tbl
# A language scholarship is awarded to those with an 80 or above in English AND French
tbl %>%
  filter(
    if_all(.cols = c("english", "french"),
           .fns = ~ .x >= 80)
  )
# A lower quality language scholarship is awarded to those with an 80 or above in English OR French
tbl %>%
  filter(
    if_any(.cols = c("english", "french"),
           .fns = ~ .x >= 80)
  )
# A science scholarship is awarded to those with an 80 or above in Maths AND Biology
tbl %>%
  filter(
    if_all(.cols = c("maths", "biology"),
           .fns = ~ .x >= 80)
  )
# A lower quality science scholarship is awarded to those with an 80 or above in Maths OR Biology
tbl %>%
  filter(
    if_any(.cols = c("maths", "biology"),
           .fns = ~ .x >= 80)
  )