Joining Datasets - what am I doing wrong?

Hi all,

I'm trying to join two datasets from Public Health Scotland

There's a large dataset of hospital admissions here
and a smaller dataset of patient deaths here. I want to join the two so I can compare deaths as a proportion of admissions over time. However, I can't work out how to do this without causing lots of duplication of rows. As all deaths are a subset of admissions, I thought left joining deaths to admissions would be the most logical but that doesn't seem to work. I'd really welcome any assistance. Very many thanks.

Code is


admissions <- read_csv("")
deaths <- read_csv("")

combined <- admissions %>% left_join(deaths)

knowing the contents of the data is generally going to be important.
you are using left_join without specificying the join varaibles to use.
There are many common variable names across the two.

 [1] "HBR"              "HBRQF"            "CA"               "CAQF"             "AgeGroup"        
 [6] "AgeGroupQF"       "Sex"              "SexQF"            "InjuryLocation"   "InjuryLocationQF"
[11] "InjuryType"       "InjuryTypeQF"  

Is there a variable that represents a unique patient ?

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.