I have some data that looks a bit like this:
df_start <- data.frame(time = 1:10,
                       var1 = c(0, 0, 1, 1, 1, 1, 0, 0, 1, 0))
For my purposes, var1 contains 5 consecutive stretches of the same value: two 0s, four 1s, two 0s, one 1, and one 0. I want to know the value of 'time' where each of these stretches starts and ends.
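As a quick sanity check on those stretches, base R's rle() reports the run lengths and values for the example vector. This is only a sketch to confirm the five stretches (the object name `runs` is just for illustration); it doesn't produce the start/mid/end labels I'm after.

```r
# Example vector copied from df_start$var1
var1 <- c(0, 0, 1, 1, 1, 1, 0, 0, 1, 0)

# rle() collapses consecutive repeats into run lengths and run values
runs <- rle(var1)

runs$lengths  # 2 4 2 1 1
runs$values   # 0 1 0 1 0
```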
I want to create a new column, seq, which tells me whether each value in var1 is at the start or end of a stretch of 0s or 1s. If it is the first value of a stretch, it should be "start"; if it's in the middle of a stretch, "mid"; and if it's the last value of a stretch, "end". If it's the only 0 or only 1 in its stretch, it should be "solo".
Essentially, I'm trying to end up with a dataframe that looks like this:
df_aim <- data.frame(time = 1:10,
                     var1 = c(0, 0, 1, 1, 1, 1, 0, 0, 1, 0),
                     seq = c("start", "end", "start", "mid", "mid", "end", "start", "end", "solo", "solo"))
I know I can use case_when() within mutate to create seq, and give it a value based on what's happening in var1. But is there a way to get it to look at what's happening in the row above or below to code the new column?
I'm imagining something like this:
# Not real code!!
df_start %>% mutate(seq = case_when(
  row_above != current_row & row_below == current_row ~ "start",
  row_above == current_row & row_below == current_row ~ "mid",
  row_above == current_row & row_below != current_row ~ "end",
  row_above != current_row & row_below != current_row ~ "solo"
  )
)
But I'm not sure how to actually implement this in the tidyverse, because I don't know how to make it look at the row above or below each row. Any thoughts/suggestions?
Thanks in advance!