I hate to ask such an extremely simple question, but here it is. I've got a time-by-group dataset in "long" format. Within each of the two groups, the N's decline over time due to follow-up attrition. When putting together my descriptive stats, I want to express the N at each time (within each group) as a percentage of that group's N at the first time point.
Doing it is bog simple: stick a filter() call inside a mutate(). I guess I have a habit of staying in my Tidyverse comfort zone even when doing trivial tasks. Can anyone point out an equally simple (or simpler) way of computing a percentage-of-baseline-n quantity in plain old base R code?
library(tidyverse)
# Create a fake dataset (3 times by 2 groups) of descriptive stats
df <- data.frame(time=rep(0:2,each=2),
                 grp=rep(0:1,3),
                 n=c(200,190,180,175,150,150),
                 p=c(0.1,0.2,0.15,0.26,0.21,0.32),
                 age=round(rnorm(6,50,6),1))
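# Add a variable reflecting loss to follow-up at t1, t2 (percentage of n from t0)
# Caution: inside filter(), grp==grp compares the column with itself, so it is always TRUE.
# The result is right here only because the two t0 n's recycle in the same grp order as the rows.
newdf <- df %>% mutate(pct=100*n/filter(df,time==0,grp==grp)$n)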
newdf
#>   time grp   n    p  age       pct
#> 1    0   0 200 0.10 45.4 100.00000
#> 2    0   1 190 0.20 49.4 100.00000
#> 3    1   0 180 0.15 39.0  90.00000
#> 4    1   1 175 0.26 63.3  92.10526
#> 5    2   0 150 0.21 46.1  75.00000
#> 6    2   1 150 0.32 47.5  78.94737
Created on 2019-01-28 by the reprex package (v0.2.1)
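For comparison, here is one possible base R sketch of the same percentage-of-baseline calculation. It assumes each group has exactly one row at time 0. The helper name base is only illustrative, and the p and age columns are left out to keep it short. The idea is to pull out the time-0 rows and use match() to line every row up with its own group's baseline n, so the result does not depend on row order.

```r
# Same n's as the fake dataset above (p and age omitted for brevity)
df <- data.frame(time = rep(0:2, each = 2),
                 grp  = rep(0:1, 3),
                 n    = c(200, 190, 180, 175, 150, 150))

# Baseline rows only (time 0); assumed to be one row per group
base <- df[df$time == 0, c("grp", "n")]

# For each row, look up the baseline n of its group and express n as a percentage of it
df$pct <- 100 * df$n / base$n[match(df$grp, base$grp)]
df
```

Because the lookup is keyed on grp, this should still give the right percentages if the rows are shuffled or a group is missing a later time point.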