Hello, I'm trying to use the ifelse() function in R to create a new column that groups the values of one column into "Low", "Average" or "High" categories, but I can't figure out how to handle zeros.
Here are the conditions:
- If the value in column "q2" (the median of a year) is less than or equal to "zn_min" (the minimum of a zone, for all years), then it should be given the designation "Low" in the "group" column. For example, row 5 in the data should be in group "Low" because q2 = 0, which is less than or equal to the min column (zn_min) in that row.
- If "q2" is between "zn_q3" and "zn_max", inclusive (the 3rd quartile and max of a zone, for all years), then it should be given the designation "High" in the "group" column.
- If neither condition is met (meaning q2 is between "zn_q1" and "zn_q3"), then it should be "Ave.".
I think the problem occurs when all columns except the max equal zero: the function then can't differentiate between my "High" and "Low" conditions. Any suggestions?
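To make the suspected overlap concrete, here is a minimal sketch using the values from row 5 of the data (q2 = 0, zn_min = 0, zn_q3 = 0, zn_max = 0.4). The variable names high_test, low_test and result are only for illustration. Both tests come out TRUE for that row, and a nested ifelse() returns the branch of whichever test is checked first:

```r
# Values copied from row 5 of the posted data
q2     <- 0
zn_min <- 0
zn_q3  <- 0
zn_max <- 0.4

# Both conditions hold at the same time when zn_min == zn_q3 == q2
high_test <- q2 >= zn_q3 & q2 <= zn_max
low_test  <- q2 <= zn_min

# "High" is tested first, so it wins even though "Low" also applies
result <- ifelse(high_test, "High", ifelse(low_test, "Low", "Ave."))
print(result)  # "High"
```

So the function is not failing to tell the conditions apart; the conditions themselves overlap, and the order of the tests decides the outcome.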
My attempt:
x <- median_by_year$q2
median_by_year$group <- ifelse(x >= median_by_year$zn_q3 & x <= median_by_year$zn_max, "High",
ifelse(x <= median_by_year$zn_min, "Low", "Ave."))
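One possible adjustment, sketched under the assumption that "Low" should win whenever both conditions are true, is to test the "Low" condition first. The data frame df below is a hypothetical stand-in built from the first six rows of the posted data, not the real median_by_year object. unname() is used because q2 carries names such as "50%", which ifelse() would otherwise copy into the new column:

```r
# Hypothetical stand-in for median_by_year (first six rows of the posted data)
df <- data.frame(
  q2     = c(0.1, 0.1, 0.05, 0.05, 0, 0.05),
  zn_min = 0,
  zn_q3  = 0,
  zn_max = 0.4
)

# Drop names such as "50%" (harmless if there are none)
x <- unname(df$q2)

# Assumption: ties go to "Low", so that condition is checked before "High"
df$group <- ifelse(x <= df$zn_min, "Low",
            ifelse(x >= df$zn_q3 & x <= df$zn_max, "High", "Ave."))

print(df$group)  # "High" "High" "High" "High" "Low" "High"
```

Note that with zn_q3 equal to 0 for every row, any positive q2 still satisfies the "High" condition, so no row in this zone can end up as "Ave." under these rules.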
Data:
dput(median_by_year)
structure(list(CYR = structure(1:19, levels = c("2004", "2005",
"2006", "2007", "2008", "2009", "2010", "2011", "2012", "2013",
"2014", "2015", "2016", "2017", "2018", "2019", "2020", "2021",
"2022"), class = "factor"), Zone = c("Crocodile", "Crocodile",
"Crocodile", "Crocodile", "Crocodile", "Crocodile", "Crocodile",
"Crocodile", "Crocodile", "Crocodile", "Crocodile", "Crocodile",
"Crocodile", "Crocodile", "Crocodile", "Crocodile", "Crocodile",
"Crocodile", "Crocodile"), q2 = c(`50%` = 0.1, `50%` = 0.1, `50%` = 0.05,
`50%` = 0.05, `50%` = 0, `50%` = 0.05, `50%` = 0, `50%` = 0,
`50%` = 0, `50%` = 0, `50%` = 0, `50%` = 0, `50%` = 0, `50%` = 0,
`50%` = 0.05, `50%` = 0, `50%` = 0, `50%` = 0, `50%` = 0), zn_min = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), zn_q1 = c(`25%` = 0,
`25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0,
`25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0,
`25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0, `25%` = 0
), zn_q3 = c(`75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0,
`75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0,
`75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0, `75%` = 0,
`75%` = 0, `75%` = 0), zn_max = c(0.4, 0.4, 0.4, 0.4, 0.4, 0.4,
0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4
), group = c(`50%` = "High", `50%` = "High", `50%` = "High",
`50%` = "High", `50%` = "High", `50%` = "High", `50%` = "High",
`50%` = "High", `50%` = "High", `50%` = "High", `50%` = "High",
`50%` = "High", `50%` = "High", `50%` = "High", `50%` = "High",
`50%` = "High", `50%` = "High", `50%` = "High", `50%` = "High"
)), row.names = c(NA, -19L), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), groups = structure(list(CYR = structure(1:19, levels = c("2004",
"2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012",
"2013", "2014", "2015", "2016", "2017", "2018", "2019", "2020",
"2021", "2022"), class = "factor"), .rows = structure(list(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -19L), .drop = TRUE))