Hello, I am using the begin_date and end_date functions to filter my data, but they end up deleting all the rows. I have tried to make sure that the end_date I specify is earlier than the earliest date in my dataset, but this does not seem to help. Does anyone know what I can do to apply this subset?
Below are the functions I am using:
Would you supply us with some sample data, please? The code looks like it should work. I suspect the date column in your data.frame may not be in Date format. Try str(data) to see.
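To illustrate the kind of check meant here, a minimal sketch follows. The data frame mydata, its column obs_date, and the day/month/year text format are all made up for the example, since we have not seen the real data; the date range is the one used elsewhere in this thread.

```r
# Hypothetical data: the date column was read in as text (day/month/year).
mydata <- data.frame(
  vals = c("a", "b", "c"),
  obs_date = c("15/08/2022", "01/10/2022", "20/12/2022"),
  stringsAsFactors = FALSE
)

str(mydata)              # obs_date is listed as chr rather than Date
class(mydata$obs_date)   # "character"

# Convert with an explicit format that matches how the text is written.
mydata$obs_date <- as.Date(mydata$obs_date, format = "%d/%m/%Y")
class(mydata$obs_date)   # "Date"

begin_date <- as.Date("2022-09-01")
end_date   <- as.Date("2022-11-30")

# With a real Date column the range filter behaves as expected.
filtered <- subset(mydata, obs_date >= begin_date & obs_date <= end_date)
filtered
```

If str() already reports the column as Date, the problem is more likely in the begin or end value than in the column.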
BTW, both data and date are functions in R. It is safer not to use them as variable names.
A handy way to supply some sample data is the dput() function. Just do dput(mydata), where mydata is your data, then copy the output and paste it here. For a large dataset, something like dput(head(mydata, 100)) should supply the data we need.
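As a rough sketch of why dput() is useful, assuming a small made-up data frame called mydata with a Date column named obs_date: the printed output is plain R code, so whoever pastes it into their session gets the same object back, column types included.

```r
# Hypothetical stand-in for the real data.
mydata <- data.frame(
  vals = letters[1:5],
  obs_date = as.Date(c("2022-03-01", "2022-06-01", "2022-09-01",
                       "2022-10-01", "2022-12-01"))
)

# Print a copy-and-paste representation of the first rows.
dput(head(mydata, 3))

# Round trip: evaluating the dput() text rebuilds the data frame.
dumped   <- capture.output(dput(mydata))
restored <- eval(parse(text = dumped))
all.equal(mydata, restored)
```

Unlike a printed table, this keeps the class of each column, which is exactly what we need to see for a date problem.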
There are several issues here. The first is that there is no 31st of November, since November only has 30 days. Next, the function subset is from base R, so I don't get the point of including library(dplyr) and library(tidyr) in your question.
However, this is what I get with your code, and everything works fine, so there might be an issue with your data, not your code:
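A small sketch of how an impossible end date can empty the result, assuming the date was given with an explicit format argument (the original call is not shown in the thread, so that part is a guess). The names mydata and obs_date are made up for the example.

```r
# Assumption: the end date was written as 31 November with an explicit format.
bad_end <- as.Date("2022-11-31", format = "%Y-%m-%d")
is.na(bad_end)   # TRUE: a calendar date that does not exist is parsed as NA

begin_date <- as.Date("2022-09-01")

# Hypothetical data with a proper Date column.
mydata <- data.frame(
  vals = c("a", "b", "c"),
  obs_date = as.Date(c("2022-08-15", "2022-10-01", "2022-12-20"))
)

# Comparisons against NA give NA, and subset() drops rows where the
# condition is NA, so nothing survives the filter.
empty <- subset(mydata, obs_date >= begin_date & obs_date <= bad_end)
nrow(empty)      # 0

# With a valid end date the same filter keeps the October row.
good_end <- as.Date("2022-11-30", format = "%Y-%m-%d")
kept <- subset(mydata, obs_date >= begin_date & obs_date <= good_end)
kept
```

Checking is.na(begin_date) and is.na(end_date) before filtering is a cheap way to rule this out.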
> begin_date <- as.Date("2022-09-01")
> end_date <- as.Date("2022-11-30")
> data <- data.frame(
vals = letters[1:10],
date = as.Date(paste0("2022-",sample(3:12,10,TRUE),"-01"), format = "%Y-%m-%d")
)
> str(data)
'data.frame': 10 obs. of 2 variables:
$ vals: chr "a" "b" "c" "d" ...
$ date: Date, format: "2022-03-01" "2022-06-01" ...
> max(data$date)
[1] "2022-12-01"
> data2 <- data |>
subset(`date` >= begin_date & `date` <= end_date)
>
> data2
vals date
4 d 2022-10-01
5 e 2022-10-01
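If the filter still returns nothing on the real data, one way to narrow it down is to test each half of the condition separately. The sketch below uses made-up data (mydata, obs_date) and a fixed seed only so it is repeatable; the idea is to see which side of the range rejects the rows.

```r
set.seed(1)  # only so the made-up data is the same on every run

# Hypothetical data: ten first-of-month dates between March and December 2022.
mydata <- data.frame(
  vals = letters[1:10],
  obs_date = as.Date(sprintf("2022-%02d-01", sample(3:12, 10, TRUE)))
)

begin_date <- as.Date("2022-09-01")
end_date   <- as.Date("2022-11-30")

# Does the requested window overlap the data at all?
range(mydata$obs_date)

# Any missing values in the column or in the bounds?
anyNA(mydata$obs_date)
anyNA(c(begin_date, end_date))

# How many rows pass each condition on its own, and both together?
n_after  <- sum(mydata$obs_date >= begin_date, na.rm = TRUE)
n_before <- sum(mydata$obs_date <= end_date, na.rm = TRUE)
n_both   <- sum(mydata$obs_date >= begin_date &
                mydata$obs_date <= end_date, na.rm = TRUE)
c(after_begin = n_after, before_end = n_before, both = n_both)
```

If one of the single counts is zero, that bound is the one to look at.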