I know my question was unclear, so let me state it more clearly. My data set is a panel with many persons (id) and a month variable (month).
Some people have all 12 months (1-12), but others have only some of them. I want to keep only those who have all 12 months.
Please be aware that a REPRoducible EXample (reprex) is much more useful than a textual description of your problem and it is also a polite thing to do when asking people to help you with coding questions.
Thanks. I have uploaded my data. I need to keep the ids that have all 12 months in 2011. In my data set, each id has observations in both 2011 and 2013. There is no restriction on 2013; the restriction applies only to 2011.
I created two new variables, month and year. The next thing I need to do is keep only those ids that have observations in all 12 months of 2011. In other words, an id with only 11 months in 2013 is fine and should be kept, as long as it has all 12 months in 2011.
Here is the multi-pipe statement split out into single steps for you to analyse:
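library(tidyverse)
library(lubridate)

# Derive month and year from the billing date
newdata <- mutate(newdata, month = month(bill.date))
newdata <- mutate(newdata, year = year(bill.date))

# Keep only the 2011 observations
a <- filter(newdata, year == 2011)
a

# Drop the date so rows that differ only by day become duplicates
b <- a %>%
  select(-bill.date)
b

# Remove duplicate rows
c <- b %>%
  distinct()
c

# Group by person
d <- c %>%
  group_by(id)
d

# Count the remaining rows per id
e <- d %>%
  count()
e

# Keep the ids with exactly 12 rows in 2011
f <- e %>%
  filter(n == 12)
f

# Keep only the id column
g <- f %>%
  select(id)
g

# Join the full data (all years) back onto the qualifying ids
ids_with_12_in_2011_with_data <- left_join(g, newdata, by = "id")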
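If you prefer something more compact, the steps above can probably be collapsed into one pipeline. This is only a minimal sketch, not the one right way to do it: the toy `newdata` below is made up for illustration (your real data will have more columns), and I have swapped in `n_distinct(month)` and `semi_join()` in place of `distinct()`/`count()` and `left_join()`. Counting distinct months directly should still work if other columns vary within a month, and `semi_join()` keeps the rows of `newdata` without adding any columns.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: id 1 has all 12 months of 2011, id 2 has only 11.
# Both ids also have one observation in 2013.
newdata <- tibble(
  id = c(rep(1, 13), rep(2, 12)),
  bill.date = c(
    ymd("2011-01-15") + months(0:11), ymd("2013-03-15"),  # id 1
    ymd("2011-01-15") + months(0:10), ymd("2013-03-15")   # id 2
  )
)

# Find the ids whose 2011 observations cover 12 distinct months
complete_ids <- newdata %>%
  mutate(month = month(bill.date), year = year(bill.date)) %>%
  filter(year == 2011) %>%
  group_by(id) %>%
  summarise(n_months = n_distinct(month)) %>%
  filter(n_months == 12)

# Keep every row (2011 and 2013) belonging to those ids
result <- semi_join(newdata, complete_ids, by = "id")
result
```

With the toy data, only the rows for id 1 should remain, including its 2013 row, since the restriction applies to 2011 only.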