Hi All, I am fairly new to R and trying to produce a column 'Adjusted_Avg' where the column entries are based on conditions of values in a previous columns called 'Net' and 'Cumulative'.
Conditions (when grouped by ID):
(i): If Net != 0 & Cumulative = Net, Adjusted_Avg = Net
(ii): If Net != 0 and lag(Net) != 0, Adjusted_Avg = Net
(iii): If Net = 0, Adjusted_Avg = the next non-zero Net value divided by number of rows to reach that non-zero value.
This can be more easily understood by comparing df_net to df_spread_ideal
For context, this is the starting table 'df_net':
R Code for df_net table:
{r}
df_net <- structure(list(ID = c(14, 14, 14, 14, 23, 23, 36, 36, 36, 36,
112, 112, 112, 112, 112), month_year = c("04-2023", "05-2023",
"06-2023", "07-2023", "04-2023", "05-2023", "04-2023", "05-2023",
"06-2023", "07-2023", "03-2023", "04-2023", "05-2023", "06-2023",
"07-2023"), Cumulative = c(200, 200, 200, 1000, 400, 900, 300,
900, 900, 1600, 200, 200, 1000, 1000, 2000), Net = c(200, 0,
0, 800, 400, 500, 300, 600, 0, 700, 200, 0, 800, 0, 1000)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -15L), groups = structure(list(
ID = c(14, 23, 36, 112), .rows = structure(list(1:4, 5:6,
7:10, 11:15), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE))
This is the table I want
And here is my code so far for trying to transform df_net into df_spread_ideal:
df_spread <- df_net %>%
group_by(ID) %>%
mutate(
Adjusted_Avg = case_when(
Net != 0 & Net == Cumulative ~ Net,
Net != 0 & Net != Cumulative & lag(Net) != 0 ~ Cumulative - lag(Cumulative),
TRUE ~ {
interval_count <- sum(Net == 0) + 1
Next_Net <- tail(Net[Net != 0], 1)
Next_Net / interval_count
}
)
)
I know my code is slightly flawed as it is taking the last non-zero 'Net' value for each ID rather than the next non-zero 'Net' value to work out the numerator and similarly it's working out how many 0 Net values there are for the entire ID rather than up until the point of the next non-zero net value.
Any help would be appreciated, thanks!