EC_2021_ASPR <- subset(EC, EC$year == 2021 &
                         EC$location == "Global" &
                         EC$age_name == 'Age-standardized' &
                         EC$sex_name == "Both" &
                         EC$metric_name == 'Rate' &
                         EC$measure_name == 'Incidence')
EC_2021_ASPR$val <- round(EC_2021_ASPR$val,1)
EC_2021_ASPR$lower <- round(EC_2021_ASPR$lower,1)
EC_2021_ASPR$upper <- round(EC_2021_ASPR$upper,1)
EC_2021_ASPR$lower_to_upper <- paste(EC_2021_ASPR$lower,EC_2021_ASPR$upper,sep = ' to ')
EC_2021_ASPR$lower_to_upper <- paste('(',EC_2021_ASPR$lower_to_upper,')',sep = '')
EC_2021_ASPR$ASPR <- paste(EC_2021_ASPR$val,EC_2021_ASPR$lower_to_upper,sep = ' ')
print(EC_2021_ASPR)
You haven't provided a data sample, so I can only base my answer on the code and on assumptions about your dataset. (If you provide a data sample, we can test the code.) It seems like you're working with a data frame (or tibble) called EC and creating EC_2021_ASPR as a filtered version.
The simplest change is just to stop selecting for year==2021 in the first line. Here, I've modified the code to say EC$year >= 1990 & EC$year <= 2021. If 1990 through 2021 are the only years present in your data, then this condition may not be necessary at all, and you could simply remove it.
ASPR_all_years <- subset(EC,
                         EC$year >= 1990 & EC$year <= 2021 &
                           EC$location == "Global" &
                           EC$age_name == "Age-standardized" &
                           EC$sex_name == "Both" &
                           EC$metric_name == "Rate" &
                           EC$measure_name == "Incidence")
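Since there is no data sample, here is a small sketch for checking the year-range filter on made-up data. EC_toy and every value in it are hypothetical stand-ins, not your real data; only the column names are taken from your code. It also shows that subset() evaluates its condition inside the data frame, so the EC$ prefix is optional even in base R.

```r
# Hypothetical stand-in for EC (made-up numbers): one row before 1990,
# two rows inside the range, and one row that fails the sex_name filter.
EC_toy <- data.frame(
  year         = c(1989, 1990, 2021, 2021),
  location     = "Global",
  age_name     = "Age-standardized",
  sex_name     = c("Both", "Both", "Both", "Male"),
  metric_name  = "Rate",
  measure_name = "Incidence",
  val          = c(5.11, 5.26, 6.84, 9.97)
)

# subset() looks up column names in the data frame itself,
# so the condition can be written without the EC_toy$ prefix.
toy_all_years <- subset(
  EC_toy,
  year >= 1990 & year <= 2021 &
    location == "Global" &
    age_name == "Age-standardized" &
    sex_name == "Both" &
    metric_name == "Rate" &
    measure_name == "Incidence"
)

# Only the 1990 and 2021 "Both" rows should remain
print(toy_all_years)
```

If the same check on your real EC returns more than one row per year, one of the other filters is probably not as selective as expected.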
Your code uses base R and is a bit more repetitive than it needs to be. Might I suggest a few changes?
Firstly, you can use the dplyr package, which is very commonly used in R nowadays, instead of the base R functions you're using. This allows you to filter the data along all the same criteria without having to repeat the EC$ expression numerous times:
# Un-comment this line and run once if you don't already have dplyr installed
# install.packages("dplyr")
library(dplyr)
ASPR_all_years <- EC |>
  filter(year %in% 1990:2021,
         location == "Global",
         age_name == "Age-standardized",
         sex_name == "Both",
         metric_name == "Rate",
         measure_name == "Incidence")
The pipe symbol (|>) lets you string together commands. Basically, this code says "Define ASPR_all_years as EC. Now take only the rows where the year falls in a range from 1990 to 2021 (inclusive), location is 'Global', ... etc."
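To make the pipe concrete: x |> f(y) is just another way of writing f(x, y). A minimal sketch with made-up numbers (the native |> pipe needs R 4.1 or newer):

```r
# Made-up values, only to illustrate the pipe
x <- c(12.34, 10.12, 14.57)

# The left-hand side becomes the first argument of the call on the right
piped  <- x |> round(1)

# Equivalent ordinary function call
nested <- round(x, 1)

# Both forms give exactly the same result
print(identical(piped, nested))
```

The same rule is what lets EC |> filter(...) read as filter(EC, ...).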
The other advantage of using the dplyr package is that you can avoid repeatedly redefining the same data frame. In the lines of your code that follow the filtering step, you make a series of incremental tweaks to EC_2021_ASPR. This can get clunky. Instead, I'd suggest using the mutate function, which is also from the dplyr package:
ASPR_all_years <- EC |>
  # Filter along needed criteria. Can remove the first line if 1990 - 2021 are the only years in the data
  filter(
    year %in% 1990:2021,
    location == "Global",
    age_name == "Age-standardized",
    sex_name == "Both",
    metric_name == "Rate",
    measure_name == "Incidence"
  ) |>
  # Use the mutate function to round val, lower, and upper and to create lower_to_upper string
  mutate(
    val = round(val, 1),
    lower = round(lower, 1),
    upper = round(upper, 1),
    lower_to_upper = paste0(val, " (", lower, " to ", upper, ")")
  )
Here we're saying "Define ASPR_all_years as EC. Now take only the rows where the year falls in a range from 1990 to 2021 (inclusive), location is 'Global', ... etc. Now redefine the val, lower, and upper columns to be rounded to the tenths place and create a lower_to_upper column which is a specially formatted string with the number followed by its range in parentheses."
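One thing to watch with round() followed by paste0(): a value that rounds to a whole number loses its trailing zero, so 12.0 is pasted as "12". If you want every number shown with one decimal place, sprintf() is one option. This is only a sketch on made-up vectors; val, lower, and upper here are hypothetical stand-ins for your columns, and the sprintf() call could be used inside mutate() in the same way.

```r
# Made-up values standing in for the val, lower, and upper columns
val   <- c(12.04, 7.36)
lower <- c(10.12, 6.98)
upper <- c(14.57, 8.02)

# round() + paste0(): whole numbers lose their ".0"
with_paste <- paste0(round(val, 1), " (", round(lower, 1),
                     " to ", round(upper, 1), ")")

# sprintf() with %.1f always prints exactly one decimal place
with_sprintf <- sprintf("%.1f (%.1f to %.1f)", val, lower, upper)

print(with_paste)    # "12 (10.1 to 14.6)"   "7.4 (7 to 8)"
print(with_sprintf)  # "12.0 (10.1 to 14.6)" "7.4 (7.0 to 8.0)"
```

Note that sprintf() returns text and does the rounding itself, so the numeric columns can stay unrounded if you need them for later calculations.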
Good luck!