I'm currently trying to calculate a weighted average using dplyr on a time series every 12 hours. I've writte code that seems to work properly for a normal arithmetic mean. Seen here:
The issue is that when you use summarise_ or mutate_ you cannot call variables by name in the functions. Does every group, FullDate, have the same length and weights? If so, you could save the weights in a vector and pass to the summarise_if function:
weight_vec <- # save the weights here
AverageMet12<-Hour12Met%>%
filter(LWC >= 0.05)%>%
collapse_by("12 hourly", Date = FullDate, start_date = "2014-05-31 17:59:00")%>%
group_by(FullDate)%>%
summarise_if(is.numeric, weighted.mean, w = weight_vec, na.rm = TRUE)
I appreciate the quick response. They all do have the same length within in the dataframe. However, when I run this code, I get this error when setting the weighted vector up outside the pipes like you did:
Error: Problem with summarise() input LWC.
x 'x' and 'w' must have the same length
I think this is due to the collapse_by function creating vector lengths that are different than the weight vector, but I could be wrong about that.
Ok that means that no, the FullDate groups do not have the same length. Do you need this to be generalized for any reason? For a generalized solution, I think that using something in the purrr package would be the best.
If you don't need it to be generalized, I would just type out the functions in a summarize:
I'll look more into solving this with purrr (something I haven't tried yet). However, the other solution you give here only gives me the weighted average for the entire data set, not for each group.
Ok, I suggest you look at the output of your code one "pipe" at a time. If the group_by(FullDate) produces only one row, then you only have one group in FullDate. What is the output of:
Woah, I'm a bit embarrased. I figured out the issue. R was thinking I was using the plyr summarise function rather than the dplyr summarise function. This code worked the way I was looking for.