I have a large data set that records the months in which a given person received unemployment help, with months numbered from 1 to 24 (the data set covers two years). The same person can therefore appear 24 times if they received unemployment help for all 24 months. However, a person can also have received unemployment help for the first six months, then had a job for a year, and then returned to unemployment help. That person would then appear 12 times, but with a break in between. The dataset is ordered so that the month numbers appear from minimum to maximum.
I would like to make a dummy variable that is 1 if the person received unemployment help for 12 months in a row during these two years, and 0 if the person received unemployment help for fewer months than that or with breaks in between, for example during months 1-6 and then again during months 22-23.
Hi @frederikke. You can combine rle and diff to find runs of consecutive numbers. diff gives the lagged differences between successive numbers, and rle counts the length of each run of identical differences (for consecutive numbers, the lagged difference is 1). A run of 11 differences equal to 1 therefore corresponds to 12 consecutive months.
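A minimal sketch of that idea in base R, not necessarily the exact code used in this thread. The data frame `df`, its columns `id` and `month`, and the helper `longest_run` are hypothetical names made up for illustration. The sketch assumes the months are sorted within each person, as described in the question, and it treats "12 months in a row" as "at least 12".

```r
# Hypothetical example data: one row per person per month of help received.
df <- data.frame(
  id    = c(rep("A", 12), rep("B", 8), rep("C", 14)),
  month = c(1:12, 1:6, 22:23, 1:6, 11:18)
)

# Length of the longest run of consecutive months in a sorted vector.
longest_run <- function(months) {
  if (length(months) < 2) return(length(months))
  runs <- rle(diff(months))                # runs of identical lagged differences
  ones <- runs$lengths[runs$values == 1]   # keep only runs where the difference is 1
  if (length(ones) == 0) return(1L)        # no two months are adjacent
  max(ones) + 1L                           # k differences of 1 span k + 1 months
}

# Longest consecutive spell per person, then the 0/1 dummy.
longest <- tapply(df$month, df$id, longest_run)
dummy <- as.integer(longest >= 12)
names(dummy) <- names(longest)
dummy
```

In this made-up data, only person A has 12 consecutive months. Person C received help in 14 months in total but never for more than 8 in a row, so the dummy is 0.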
So sorry that I haven't replied sooner, but I only just now had the time to try out your solution, and it worked! Thank you so much, it was really useful!
If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems.