This is a follow up on this question. Now I want to do a few more steps:
-
Group by columns
ID
andorder
-
For every
val
indf_dat
, look up the correspondingratio
in thedf_lookup
table with the following conditions:o If
val < min(df_lookup$val)
, setnew_ratio = min(df_lookup$ratio)
o Ifval > max(df_lookup$val)
, setnew_ratio = max(df_lookup$ratio)
o Ifval
falls withindf_lookup$val range
, do a simple linear interpolation
library(dplyr)
df_lookup <- tribble(
~ID, ~order, ~pct, ~val, ~ratio,
"batch1", 1, 1, 1, 0.2,
"batch1", 1, 10, 8, 0.5,
"batch1", 1, 25, 25, 1.2,
"batch2", 2, 1, 2, 0.1,
"batch2", 2, 10, 15, 0.75,
"batch2", 2, 25, 33, 1.5,
"batch2", 2, 50, 55, 3.2,
)
df_dat <- tribble(
~order, ~ID, ~val,
1, "batch1", 0.1,
1, "batch1", 30,
1, "batch1", 2,
1, "batch1", 12,
2, "batch1", 45,
2, "batch2", 1.5,
2, "batch2", 30,
2, "batch2", 13,
2, "batch2", 60,
)
I have a working data.table
code. Is it possible to achieve the same goal with tidyverse
's verbs?
library(data.table)
setDT(df_lookup)
setDT(df_dat)
df_lookup[, m := (ratio - shift(ratio, -1L)) / (val - shift(val, -1L))]
df_dat[, new_ratio :=
df_lookup[.SD, on=.(order, ID, val), roll=Inf, rollends=c(FALSE, FALSE),
x.m * (i.val - x.val) + x.ratio]
]
df_dat[is.na(new_ratio), new_ratio :=
df_lookup[copy(.SD), on=.(order, ID, val), roll=Inf, x.ratio]]
df_dat[is.na(new_ratio), new_ratio :=
df_lookup[copy(.SD), on=.(order, ID, val), roll=-Inf, x.ratio]]
order ID val new_ratio
1 1 batch1 0.1 0.2
2 1 batch1 30 1.2
3 1 batch1 2 0.243
4 1 batch1 12 0.643
5 2 batch1 45 NA
6 2 batch2 1.5 0.1
7 2 batch2 30 1.38
8 2 batch2 13 0.65
9 2 batch2 60 3.2
Example:
- For
order = 1
,ID = batch2
andval = 30
,new_ratio
=1.2
(maxratio
value). - For
order = 1
,ID = batch1
andval = 2
,new_ratio = 0.243
which is the interpolatedratio
value between 0.2 and 0.5. - For
order = 2
andID = batch1
,new_ratio
= NA as those conditions aren’t in the lookup table.
Any help appreciated!