Dear R experts,
I have a data frame as below:
df=data.frame(cluster_id=c(1,1,2,2,3),
              datetime0=c('2022-01-03','2022-02-03','2021-11-14','2022-01-18','2022-01-27'),
              var1=c(10,0,0,0,9),
              var2=c(10,0,0,0,9),
              var3=c(0,1,0,3,9))
I used the following commands to summarize the data frame.
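library(data.table)  # setDT()
library(dplyr)       # n_distinct()
# For each cluster, count the distinct dates on which each variable is positive
setDT(df)[, list(var1=n_distinct(ifelse(var1>0, datetime0, NA), na.rm=TRUE),
                 var2=n_distinct(ifelse(var2>0, datetime0, NA), na.rm=TRUE),
                 var3=n_distinct(ifelse(var3>0, datetime0, NA), na.rm=TRUE)),
          by=.(cluster_id)]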
However, the list of variables could be much longer (e.g., var1 to var20), and I wonder whether there is a way to get the same result without writing var*=n_distinct(ifelse(var*>0, datetime0, NA), na.rm=TRUE) for each of var1 to var20.
Your suggestions will be appreciated.
Sincerely,
Veda
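
One possible direction, offered as a sketch rather than a definitive answer: data.table's .SD and .SDcols let the same function be applied to a whole set of columns within each group. The names vars and res below are introduced only for illustration, and the sketch assumes the columns of interest are all named "var" followed by a number. It uses data.table's uniqueN() in place of dplyr's n_distinct(), and which() so that any NA values in the variables are ignored.

```r
library(data.table)

df <- data.frame(cluster_id = c(1, 1, 2, 2, 3),
                 datetime0 = c('2022-01-03', '2022-02-03', '2021-11-14',
                               '2022-01-18', '2022-01-27'),
                 var1 = c(10, 0, 0, 0, 9),
                 var2 = c(10, 0, 0, 0, 9),
                 var3 = c(0, 1, 0, 3, 9))

# Assumption: the columns to summarize match the pattern var<number>
vars <- grep("^var[0-9]+$", names(df), value = TRUE)

# For each cluster and each selected column, count the distinct dates
# on which that column is positive
res <- setDT(df)[, lapply(.SD, function(x) uniqueN(datetime0[which(x > 0)])),
                 by = cluster_id, .SDcols = vars]
print(res)
```

Because the column names are collected into vars by pattern, the same call should cover var1 to var20 without listing each one.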