I have these two given dataframes:
df_r_ranges = data.frame(
stringsAsFactors = FALSE,
GROUP_BY = c(NA, NA, NA, NA, NA),
R_Q = c("4", "1", "3", "2", "5"),
R_RANGE = c("591-1161","1882-2303",
"1162-1527","1528-1878","0-590"),
R_MIN = c(591, 1882, 1162, 1528, 0),
R_MAX = c(1161, 2303, 1527, 1878, 590)
)
df_r_values = data.frame(
stringsAsFactors = FALSE,
ID = c("1234567","2345678","3456789",
"4567890","7654321","8765432","9875643","0987654",
"1029384","2938475"),
GROUP_BY = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
R_VALUE = c(9L, 28L, 17L, 605L, 154L, 0L, 596L, 549L, 3L, 0L)
)
(Please ignore these ID
values :))
If the R_VALUE
is 600, then that ID
should be assigned R_Q = 4
.
And these ranges are different across different GROUP_BY
groups (but consistent within each GROUP_BY
groups), so they need to work with different GROUP_BY
groups as well.
So the end result I desire would be something along the lines of
data.frame(
stringsAsFactors = FALSE,
ID = c("1234567","2345678","3456789",
"4567890","7654321","8765432","9875643","0987654",
"1029384","2938475"),
R_VALUE = c(981L,931L,1074L,925L,981L,
94L,92L,84L,1001L,1074L),
R_Q = c(1, 1, 1, 1, 1, 4, 4, 4, 1, 1),
GROUP_BY = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)
)
First thing that came to my mind was to do a join, then filter. However, when doing a join, it would inflate the number of rows, which is not scalable. I'm unsure if there are other options.