The Problem:
Vectorising a for loop in which each iteration uses the result of the previous iteration to calculate the current value.
I'm calculating the current stock of medication a person has, based on:
- The first prescription amount V_0
- The daily dose used (linear decrease per day).
- The stock is updated by new prescriptions V_1
- The amount is positive or 0, since you cannot have a negative amount of medication
Calculating the current medication stock
V_{current} = V_0 - 500_{mg} + V_1
with V_{current} \ge 0
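Read as a day-by-day recurrence, the rule above might be written as follows. This is a sketch of one reading, not something stated in the data: the symbols V_t (stock on day t), P_t (new prescription on day t, 0 if none) and d (daily dose, 500 mg here) are introduced only for illustration, and clamping at zero before adding the new prescription is an assumption.

```latex
V_t = \max\left(V_{t-1} - d,\ 0\right) + P_t, \qquad t \ge 1, \qquad d = 500\,\mathrm{mg}
```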
Code
- Each row in the data represents one day, so with each iteration the daily dose (e.g. 500 mg) is subtracted
- The loop starts at row 2 since row 1 contains the initial stock of medication.
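# Expects one person's rows in day order; row 1 holds the initial stock in vol_c.
roll_fun_long <- function(., ...) {
  # Start at row 2; seq_len(...)[-1] is empty for a single row, unlike 2:nrow(.)
  for (i in seq_len(nrow(.))[-1]) {
    .[i, "vol_c"] <- ifelse((.[i - 1, "vol_c"] - 500) < 0,
                            # Stock ran out: only the new prescription remains
                            # (assumed to be v_1, matching the branch below)
                            0 + .[i, "v_1"],
                            .[i - 1, "vol_c"] - 500 + .[i, "v_1"])
  }
  return(.)
}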
Thus the loop references the last value it calculated (i.e. the stock of medication at t_{i-1}) to calculate the current amount of medication V_{current} at t_i.
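This kind of dependence on the previous result is a fold (a running reduction), which base R can express with `Reduce(..., accumulate = TRUE)`. The following is a minimal sketch, assuming the intended rule is "previous stock minus the dose, floored at 0, plus any new prescription". The names `stock_path`, `v0`, `v_new` and `dose` are hypothetical and do not come from the original data.

```r
# Hypothetical helper: running stock for one person.
# v0    = initial stock on day 1
# v_new = new prescriptions on days 2..n (0 when there is none)
# dose  = daily dose (500 mg in the question)
stock_path <- function(v0, v_new, dose = 500) {
  # One step of the recurrence: use up the dose, never go below 0,
  # then add whatever was newly prescribed that day.
  step <- function(prev, add) max(prev - dose, 0) + add
  # accumulate = TRUE keeps every intermediate result, starting with v0.
  Reduce(step, v_new, v0, accumulate = TRUE)
}

# Six days: the stock runs out on day 4, a refill of 1000 arrives on day 5.
stock_path(1200, c(0, 0, 0, 1000, 0))
# 1200 700 200 0 1000 500
```

This is still sequential under the hood, but it works on a plain numeric vector rather than assigning into a data frame row by row.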
Applying the function
- Splitting the data.frame along person identifiers
- Applying the loop
- Recombining into one dataframe
vol_long %<>% split(.$person) %>% map_df(roll_fun_long)  # %<>% and %>% from magrittr, map_df from purrr
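The same split / apply / recombine steps can be sketched in base R. Much of the per-row cost comes from assigning into a data frame inside the loop, so this variant loops over plain numeric vectors and writes the column back once per person. It is a sketch under assumptions, not a drop-in replacement: the toy data, the helper name `roll_fun_vec` and the `dose` argument are hypothetical, and the column layout (`person`, `vol_c`, `v_1`, rows in day order) is inferred from the question.

```r
# Toy data in the assumed layout: one row per person and day,
# vol_c holds the initial stock in each person's first row.
vol_long <- data.frame(
  person = c("a", "a", "a", "b", "b", "b"),
  vol_c  = c(1200, 0, 0, 300, 0, 0),
  v_1    = c(0, 0, 1000, 0, 0, 500)
)

# Hypothetical per-person helper: loop over vectors, assign the column once.
roll_fun_vec <- function(d, dose = 500) {
  stock <- d$vol_c
  add   <- d$v_1
  # seq_along(stock)[-1] is empty for a single row, so the loop is skipped.
  for (i in seq_along(stock)[-1]) {
    stock[i] <- max(stock[i - 1] - dose, 0) + add[i]
  }
  d$vol_c <- stock
  d
}

# Split by person, apply the helper, restore the original row order.
pieces   <- lapply(split(vol_long, vol_long$person), roll_fun_vec)
vol_long <- unsplit(pieces, vol_long$person)
vol_long$vol_c
# 1200 700 1200 300 0 500
```

`unsplit()` puts the rows back in their original positions, so the data does not need to be sorted by person beforehand, only by day within each person.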
Details
I don't know how to vectorise this function or express it in dplyr / tidyverse logic, since all the functions I know operate on whole vectors and cannot reference intermediate row results. The process is fairly inefficient and not feasible for larger datasets. I thought about using data.table because it supports modifying in place instead of rewriting the object hundreds of times.
Thanks for your suggestions