I have a logical column called "in_zoo" (TRUE or FALSE). I want to group by monkey and sort by monkey and date, so that the earliest date comes first within each monkey. For each row, I want a cumulative sum (e.g. cumsum in R) that counts how many PRIOR rows have TRUE in "in_zoo", stored in a new variable called "past_zoo". The current row's own value of "in_zoo" must not be included in "past_zoo".
This is what I have so far, but I know it's not right:
data %>%
  group_by(monkey) %>%
  arrange(monkey, date) %>%
  mutate(past_zoo = lag(cumsum(in_zoo), default = FALSE)) %>%
  ungroup()
Any help would be much appreciated, thank you all.
The output I need looks like this example:
monkey  date       in_zoo  past_zoo
Adam    1/1/2010   TRUE    0
Adam    1/19/2010  FALSE   1
Adam    1/25/2010  TRUE    1
Adam    1/31/2010  TRUE    2
Adam    2/1/2010   FALSE   3
Note that the "past_zoo" value in the first row MUST always be 0 for a given monkey, i.e. at their earliest date.
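To make the example reproducible, here is a minimal sketch that rebuilds the table as a data frame and computes "past_zoo" in one possible way. It assumes the data frame is called `data`, that the dates are stored as text in month/day/year form, and that `result` is just an illustrative name. It is not necessarily the best approach: it subtracts the current row's value from the running total, so only earlier rows are counted.

```r
library(dplyr)

# Example data from the table above (the name `data` is an assumption).
data <- data.frame(
  monkey = rep("Adam", 5),
  date   = c("1/1/2010", "1/19/2010", "1/25/2010", "1/31/2010", "2/1/2010"),
  in_zoo = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

result <- data %>%
  # Parse the text dates so they sort chronologically, not alphabetically.
  mutate(date = as.Date(date, format = "%m/%d/%Y")) %>%
  arrange(monkey, date) %>%
  group_by(monkey) %>%
  # Running count of TRUE values minus the current row's own value,
  # which leaves only the TRUE values in earlier rows.
  mutate(past_zoo = cumsum(in_zoo) - in_zoo) %>%
  ungroup()

print(result)
```

Because the first row of each monkey has nothing before it, this gives 0 there regardless of that row's own "in_zoo" value.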
Thank you all so much.