I'm fairly new to the tidyverse. I have managed to get this code working on a small example, but I'm sure there must be a better way. Here's the problem I'm trying to solve.
I have two tibbles, tA and tB (my actual tibbles have 100,000+ rows):
tA <- tibble::tribble(
~name1, ~name2,
"k", "fw",
"k", "im",
"k", "fw",
"g", "fw",
"g", "im",
)
tB <- tibble::tribble(
~k_im, ~k_fw, ~g_im, ~g_fw,
0.031, 0.053, 0.000, 0.090,
0.209, 0.105, 0.000, 0.105,
0.274, 0.125, 0.158, 0.132,
0.331, 0.186, 0.199, 0.185,
0.344, 0.205, 0.201, 0.271,
0.367, 0.235, 0.272, 0.308,
0.382, 0.270, 0.295, 0.368,
0.390, 0.285, 0.299, 0.430,
0.397, 0.355, 0.348, 0.443,
0.484, 0.406, 0.419, 0.524,
0.532, 0.470, 0.430, 0.531,
0.557, 0.530, 0.468, 0.601,
0.609, 0.590, 0.530, 0.614,
0.646, 0.646, 0.593, 0.631,
0.712, 0.692, 0.644, 0.687,
0.730, 0.700, 0.652, 0.694,
0.793, 0.725, 0.706, 0.707,
0.845, 0.768, 0.766, 0.772,
0.862, 0.831, 0.814, 0.778,
0.886, 0.876, 0.863, 0.788,
0.887, 0.918, 0.896, 0.835,
1.000, 0.926, 0.918, 0.904,
1.000, 0.976, 0.969, 0.990,
1.000, 1.000, 1.000, 1.000
)
For every row in tA, I need to generate a different random number between 0 and 1, look up the matching column in tB (for row 1 of tA, that is column k_fw), and find the first row of that column where the value is greater than the random number.
This function does what I need, e.g. findX(tB, "k_im", 0.5) returns 11.
# Index of the first row of column y in x whose value exceeds z (NA if none does)
findX <- function(x, y, z) { match(TRUE, x[y] > z) }
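As a quick check of what findX returns, here is a minimal sketch on a tiny stand-in table. The name `demo` and its values are hypothetical and are not part of the real tB; the function body is the one given above.

```r
# Hypothetical stand-in table with a single column (not the real tB)
demo <- data.frame(k_im = c(0.1, 0.4, 0.6, 0.9))

# Same definition as in the question
findX <- function(x, y, z) {
  match(TRUE, x[y] > z)
}

hit  <- findX(demo, "k_im", 0.5)   # row 3 holds the first value above 0.5
miss <- findX(demo, "k_im", 0.95)  # no value exceeds 0.95, so match() gives NA
```

With the real tB the NA case should not arise, since every column ends at 1.000 and runif() draws values strictly below 1.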
This gives the right output but involves inserting tB into every row. I think I must be missing something!
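library(dplyr)

output <- tA %>%
  mutate(name3 = paste0(name1, "_", name2)) %>%  # name of the tB column to look up
  mutate(rnm = runif(n())) %>%                    # one random number per row
  mutate(xx = list(tB)) %>%                       # copy of tB in every row
  mutate(rr = unlist(purrr::pmap(list(xx, name3, rnm), findX)))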
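For comparison, here is a minimal sketch of one possible way to do the same lookup without the list column, assuming it is acceptable to refer to the lookup table directly from inside `mutate()` and to use `purrr::map2_int()`. The tables `tA_demo` and `tB_demo` are hypothetical stand-ins with made-up values, much smaller than the real tA and tB, and whether this scales well to 100,000+ rows is not established here.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for tA and tB (made-up values)
tA_demo <- tibble::tribble(
  ~name1, ~name2,
  "k", "fw",
  "g", "im"
)
tB_demo <- tibble::tribble(
  ~k_fw, ~g_im,
  0.2, 0.3,
  0.7, 0.8,
  1.0, 1.0
)

set.seed(1)  # only so the sketch is reproducible

output_demo <- tA_demo %>%
  mutate(
    name3 = paste0(name1, "_", name2),
    rnm   = runif(n()),
    # pull the column out of tB_demo by name for each row;
    # the table itself is never stored in the rows
    rr    = map2_int(name3, rnm, ~ match(TRUE, tB_demo[[.x]] > .y))
  )
```

The comparison is the same as in findX; only the way the table reaches the function differs.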
Any thoughts? I think I've been staring at this for so long that a fresh approach might be needed. Thanks!