How to Filter Follow-Up Data Within ±120 Days of 1 Year from Baseline in R

Hi Everyone,

I am new to R. I am working with a dataset in R where each patient is identified by a unique subjectid. Each patient has a baseline_date and multiple follow-up dates (fu_date). I want to filter follow-up records for each subjectid that fall within ±120 days of exactly 1 year (365 days) from the baseline_date.

After filtering, I need to calculate the total exacerbations (fu_exacerbations) for each patient during this period, making this a new column. How can I implement this in R?

This is the [test_data](https://filebin.net/cx9mmxinegol7x3h).

Thanks a lot for your time!

I'm a little hesitant to pull data from an unknown link, but I can still help. I'm going to assume that you have a separate entry (row) for each follow-up date; if the dates are packed into a single string, it only takes another line or two of code to separate them. I would recommend the tidyverse packages, which make this pretty straightforward. To understand the code below: "%>%" is a piping operator that takes the output of the expression on its left and passes it as the first argument of the function on its right, which for most tidyverse functions is the source data.
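As a quick illustration of the pipe on its own, here is a minimal sketch using a made-up numeric vector (not the patient data). The names x, nested and piped are just for illustration; both assignments compute the same value.

```r
library(magrittr)  # provides %>%; dplyr re-exports it, so library(tidyverse) works too

x <- c(1, 4, 9)  # made-up values, purely for illustration

nested <- sum(sqrt(x))            # ordinary nested calls, read inside out
piped  <- x %>% sqrt() %>% sum()  # same computation, read left to right
```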

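library(tidyverse)
library(lubridate)  # provides days(); part of the core tidyverse only since 2.0.0

data <- read_csv("your data")

# Keep follow-ups within 120 days either side of baseline + 365 days,
# then total the exacerbations per patient
data_summary <- data %>%
    filter(fu_date >= baseline_date + days(365) - days(120),
           fu_date <= baseline_date + days(365) + days(120)) %>%
    group_by(subjectid) %>%
    summarize(total_exacerbations = sum(fu_exacerbations, na.rm = TRUE))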

This will yield a table with one column of subjectid and one column containing the total exacerbations for the selected range. Note that patients with no follow-up inside the window will not appear in this table. Because I'm not working with your data directly, a little more formatting may be needed, for example making sure both date columns are actually parsed as dates (if you post the result of dput(head([test_data], 10)), I can add that bit of code).
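Since the question asks for the total as a new column rather than a separate table, here is a minimal sketch of one way to do that. It runs on a small made-up data frame (toy), not on the linked file; the column names follow the question, while target_date and in_window are helper columns I introduce for illustration. It assumes one row per follow-up and that both date columns are of class Date.

```r
library(dplyr)
library(lubridate)

# Made-up toy data: two patients, one row per follow-up visit
toy <- data.frame(
  subjectid        = c(1, 1, 1, 2, 2),
  baseline_date    = as.Date(c("2020-01-01", "2020-01-01", "2020-01-01",
                               "2021-06-01", "2021-06-01")),
  fu_date          = as.Date(c("2020-06-01", "2020-12-15", "2021-03-01",
                               "2022-05-20", "2023-01-10")),
  fu_exacerbations = c(1, 2, 3, 4, 5)
)

result <- toy %>%
  mutate(
    # 365 days after baseline, as specified in the question
    target_date = baseline_date + days(365),
    # TRUE when the follow-up falls within 120 days either side of the target
    in_window   = fu_date >= target_date - days(120) &
                  fu_date <= target_date + days(120)
  ) %>%
  group_by(subjectid) %>%
  # Sum only the in-window rows, but keep every row of the original data
  mutate(total_exacerbations = sum(fu_exacerbations[in_window], na.rm = TRUE)) %>%
  ungroup()
```

Using mutate() instead of filter() plus summarize() keeps every follow-up row and repeats the per-patient total on each of them; a patient with no follow-up in the window gets 0 rather than being dropped.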