I have 2 CSV files: cancer admissions and cancer deaths.
File 1 (cancer admissions) contains a lot of data, but the data of interest is the type of cancer and the number of admissions for each type across the country.
File 2 (cancer deaths) also contains a lot of data, but the data of interest is the type of cancer and the number of deaths for each type across the country.
I have made 2 tables:
Table 1 = cancer type in column 1 and number of admissions in column 2.
Table 2 = cancer type in column 1 and number of deaths (among those admitted to hospital with that type of cancer) in column 2.
However, I am having difficulty calculating the percentage mortality among patients admitted to hospital, for each cancer type.
I think I need to somehow combine table 1 and table 2, so that I have 3 columns:
Cancer type in column 1
Number of admissions in column 2
Number of deaths (from those admitted) in column 3
I'd be really grateful for advice on how I can do this.
My second question is: once I have combined the 2 tables into the 3-column table above (which I am not sure how to do), how do I calculate in R the percentage mortality among admitted patients for each cancer type?
You have not provided any sample data to work with, so I invented some. I used functions from the dplyr package to join the data and make a new column. If you need more specific help, please provide samples of your data in a reproducible example as explained in the link below.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
# Invent sample data (stand-ins for the admissions and deaths tables)
Admit <- data.frame(cancer_type = c("A", "B", "C"), Admissions = c(123, 264, 97))
Deaths <- data.frame(cancer_type = c("B", "C", "A"), Died = c(57, 23, 60))
# Join the two tables on the shared cancer_type column
AllDat <- inner_join(Admit, Deaths, by = "cancer_type")
AllDat
#>   cancer_type Admissions Died
#> 1           A        123   60
#> 2           B        264   57
#> 3           C         97   23
# Calculate the mortality rate as a proportion of admissions (multiply by 100 for a percentage)
AllDat <- AllDat %>% mutate(Rate = Died/Admissions)
AllDat
#>   cancer_type Admissions Died      Rate
#> 1           A        123   60 0.4878049
#> 2           B        264   57 0.2159091
#> 3           C         97   23 0.2371134
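The question asked for a percentage, while Rate above is a proportion. Here is a minimal sketch of the same calculation expressed as a percentage, reusing the invented data from this reply. The column name PctMortality and the rounding to one decimal place are my own choices for illustration, not something taken from the real files.

```r
library(dplyr)

# Same invented data as above (not the real admissions/deaths files)
Admit <- data.frame(cancer_type = c("A", "B", "C"), Admissions = c(123, 264, 97))
Deaths <- data.frame(cancer_type = c("B", "C", "A"), Died = c(57, 23, 60))

# Join on cancer_type, then express deaths as a percentage of admissions.
# PctMortality is a hypothetical column name; rounding to 1 decimal is optional.
AllDat <- inner_join(Admit, Deaths, by = "cancer_type") %>%
  mutate(PctMortality = round(100 * Died / Admissions, 1))
AllDat
```

Note that inner_join() keeps only the cancer types that appear in both tables. If a cancer type might have admissions but no row in the deaths table, left_join(Admit, Deaths, by = "cancer_type") would keep it, with Died (and so the percentage) showing as NA.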