I'm relatively new to R and I'm uncertain whether I've correctly constructed an aggregated rate.
My data set contains the monthly number of suicides for six counties in a country. Four of these counties have received a treatment, while the remaining two are controls.
I want to present a descriptive trend in the suicide rate aggregated to treatment level, i.e. sum the suicides across the counties in each treatment group and construct a population-adjusted rate of suicides per 100 000 by treatment status.
This is my code:
# 1. Group suicide rate per 100 000 by treatment/intervention
library(dplyr)

month_intervention <- GP %>%
  group_by(time, intervention) %>%
  summarize(
    suicideRate = sum(dead / regpop * 100000))
month_intervention
# 2. Make ggplot
library(ggplot2)

SuiPlot <- ggplot(month_intervention,
                  aes(x = time,
                      y = suicideRate,
                      col = intervention)) +
  geom_smooth(se = FALSE) +
  geom_point(alpha = .5) +
  expand_limits(y = 0)
SuiPlot
Here the aggregated suicide rate is constructed by:
summarize(
  suicideRate = sum(dead / regpop * 100000))
where dead is the monthly number of suicides in each county and regpop is the yearly registered population of each county (monthly population data are not available).
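A note on this construction: sum(dead / regpop * 100000) adds up the county-level rates, so a group of four counties will tend to show a larger number than a group of two even when the underlying rates are similar. A pooled rate instead divides the group's total deaths by its total population. Below is a minimal sketch in base R, not a definitive answer. The toy data frame and its numbers are made up for illustration; only the column names follow the post, and it assumes one row per county per month.

```r
# Toy data: hypothetical numbers for a single month, column names as in the post.
toy <- data.frame(
  time         = "Jan 2012",
  intervention = c(1, 1, 1, 1, 0, 0),
  dead         = c(3, 5, 2, 10, 4, 6),
  regpop       = c(100000, 200000, 50000, 400000, 150000, 250000)
)

# Sum deaths and population within each time x intervention cell ...
pooled <- aggregate(cbind(dead, regpop) ~ time + intervention,
                    data = toy, FUN = sum)
# ... then divide the totals to get a population-weighted rate per 100 000.
pooled$suicideRate <- pooled$dead / pooled$regpop * 100000

# For comparison: the sum of county-level rates, as in the original summarize().
toy$countyRate <- toy$dead / toy$regpop * 100000
summed <- aggregate(countyRate ~ time + intervention, data = toy, FUN = sum)

pooled
summed
```

In the dplyr pipeline from the post, the corresponding change would be summarize(suicideRate = sum(dead) / sum(regpop) * 100000).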
In order for us to be able to help you better, could you ask this with a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.
If you have problems including sample data in your reprex, you can check this blog post by Mara
I've tried to figure out reprex, but my time variable seems to be causing some problems (it was made with the "zoo" package, as I have month-year data).
I've added the output from head(mydata) from a reduced data set containing the variables relevant to constructing the rate, i.e. time, intervention, dead, and regpop.
I hope this is helpful, despite my improvement potential in R:
> library("datapasta")
> head(GP_subset_reduced)
# A tibble: 6 x 10
     id region  year intervention  dead month regpop suiciderate post_2016 time
  <dbl> <fct>  <int> <fct>        <int> <int>  <dbl>       <dbl> <fct>     <S3: yearmon>
1     1 1       2012 0                3     1 844511       0.355 0         Jan 2012
2     2 1       2012 0                4     2 844511       0.474 0         Feb 2012
3     3 1       2012 0               17     3 844511       2.01  0         Mar 2012
4     4 1       2012 0               21     4 844511       2.49  0         Apr 2012
5     5 1       2012 0               12     5 844511       1.42  0         May 2012
6     6 1       2012 0               11     6 844511       1.30  0         Jun 2012
> options(dplyr.width = Inf)
> dpasta(GP_subset_reduced)
Error in strrep(" ", char_length - nchar(char_vec)) :
invalid 'times' value
In addition: Warning message:
In tribble_construct(input_table, oc = output_context) :
Column(s) 2,4,9 have been converted from factor to character in tribble output.
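If the yearmon column is indeed what trips up dpasta() (an assumption; the error message does not say so directly), one possible workaround is to share the data without that column and rebuild the time variable from year and month. The sketch below uses only base R and the first three rows of the head() output above. The name GP_small is hypothetical.

```r
# First three rows of the head() output above, without the yearmon column.
GP_small <- data.frame(
  region       = factor(c(1, 1, 1)),
  year         = c(2012L, 2012L, 2012L),
  month        = 1:3,
  intervention = factor(c(0, 0, 0)),
  dead         = c(3L, 4L, 17L),
  regpop       = c(844511, 844511, 844511)
)

# Rebuild a time variable as a plain Date (first day of each month).
GP_small$time <- as.Date(sprintf("%d-%02d-01", GP_small$year, GP_small$month))

# dput() prints R code that recreates the object; that output can be pasted into a post.
dput(GP_small)
```

With the zoo package loaded, zoo::as.yearmon(GP_small$time) should turn the dates back into a yearmon variable if that class is needed for plotting.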