I use the tidyverse heavily with three-dimensional motion data, specifically virtual reality tracking data. The data comes in the form position_x, position_y, and position_z, or quaternion_w, quaternion_x, quaternion_y, and quaternion_z, which I usually put into three or four separate columns. The trouble comes when I want to do vector operations on these columns. For example, say I want to plot the position of an object over time, with a little line segment indicating its rotation. With vanilla tidyverse, the code would be something like this:
df %>%
  mutate(
    # set the direction of the vector in the local space
    direction_local_x = 0,
    direction_local_y = 0,
    direction_local_z = 1,
    # here we have messy code that rotates a vector by a quaternion,
    # picture eight lines like this one
    direction_worldspace_x = direction_local_x*quaternion_w-direction_local_y*quaternion_z+direction_local_z*quaternion_y,
    # then add the direction vector to the position vector
    endpoint_x = position_x + direction_worldspace_x,
    endpoint_y = position_y + direction_worldspace_y,
    endpoint_z = position_z + direction_worldspace_z
  ) %>%
  ggplot(...) # do ggplot magic here
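To give a sense of what the elided lines amount to, here is a minimal base-R sketch of rotating a vector by a quaternion, written so it works on whole columns. It assumes unit quaternions in (w, x, y, z) order under the Hamilton convention; rotate_by_quaternion is a hypothetical helper name used only for illustration.

```r
# Rotate vectors (vx, vy, vz) by unit quaternions (qw, qx, qy, qz).
# Every argument is a numeric vector (length 1 or the column length),
# so the function can be applied to data frame columns directly.
# Assumption: the quaternions are normalized.
rotate_by_quaternion <- function(vx, vy, vz, qw, qx, qy, qz) {
  # t = 2 * (q_vec x v)
  tx <- 2 * (qy * vz - qz * vy)
  ty <- 2 * (qz * vx - qx * vz)
  tz <- 2 * (qx * vy - qy * vx)
  # v' = v + w * t + (q_vec x t)
  list(
    x = vx + qw * tx + (qy * tz - qz * ty),
    y = vy + qw * ty + (qz * tx - qx * tz),
    z = vz + qw * tz + (qx * ty - qy * tx)
  )
}

# A 90 degree rotation about the y axis applied to the local z axis
r <- rotate_by_quaternion(0, 0, 1, cos(pi / 4), 0, sin(pi / 4), 0)
```

Even wrapped in a helper like this, the result still has to be spread across three separate _x, _y, and _z assignments inside mutate, which is the clutter I'd like to avoid.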
Instead of that mess, I'd like to have the same exact behavior, but with much better syntactic sugar. It would be something like:
df %>%
  mutate(
    direction_local = make_3d(x=0, y=0, z=1),
    direction_worldspace = rotate_3d(direction_local, by=quaternion),
    endpoint = add_3d(position, direction_worldspace)
  ) %>%
  ggplot(...) # do ggplot magic here.
There are a few approaches in my mind, but I don't know which one to take.
Ideally, I'd like to use some premade functionality of mutate so that my function make_3d not only knows its own arguments (x, y, and z) but also that it's assigned to the variable direction_local and that it's in the mutate function coming from the tibble df. Then the function make_3d can create the columns direction_local_x, direction_local_y, and direction_local_z using rlang stuff, as long as make_3d has the original tibble and the name "direction_local". I don't know if that's possible, and that's the main question in this post.
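For comparison, here is a minimal sketch of a related mechanism that does not require make_3d to know its target name. It relies on the assumption that dplyr 1.0 or later and tidyr are available: mutate accepts a data frame as the result of an expression and stores it as a single packed column, and tidyr::unpack can then flatten it using the outer name as a prefix. The make_3d below is a hypothetical stand-in that simply returns a tibble, and df is made-up example data.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: one tibble column per vector component.
make_3d <- function(x, y, z) {
  tibble(x = x, y = y, z = z)
}

# Made-up example data with three time steps
df <- tibble(t = 1:3)

out <- df %>%
  # the one-row tibble is recycled to the number of rows in df
  # and stored as a single data frame column named direction_local
  mutate(direction_local = make_3d(x = 0, y = 0, z = 1)) %>%
  # flatten into direction_local_x, direction_local_y, direction_local_z
  unpack(direction_local, names_sep = "_")
```

This is not the same as make_3d knowing its assigned name; the prefix is added afterwards by unpack, so it needs an extra step at the end of the pipeline.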
Another option is to override mutate so that, when the outermost function on the right-hand side is one of my special functions, it calls that function with the extra arguments I need. I'm uncomfortable overriding such a ubiquitous function, but that may be the right approach here.
A third option is to make my own function, e.g., mutate_3d, that works like option 2 but avoids overriding dplyr::mutate. Then there would be some inelegant switching back and forth between mutate and mutate_3d that feels unnecessary.
Which one would be your choice?