Hi there,
I am trying to convert a (long) list with postcodes and suburb names into a (wide) list where only each postcode appears once and where the postcode covers more than one suburb, to list the suburb names in another column.
data.frame(
stringsAsFactors = FALSE,
Postcode = c(4161L,4155L,4154L,4153L,
4159L,4157L,4158L,4160L,4160L,4163L,4160L,4165L,
4165L,4157L,4164L,4165L,4178L,4179L,4179L,4172L),
SA2.name = c("Alexandra Hills",
"Belmont - Gumdale","Belmont - Gumdale","Belmont - Gumdale",
"Birkdale","Capalaba","Thorneside","Wellington Point",
"Cleveland","Cleveland","Ormiston","Redland Bay",
"Sheldon - Mount Cotton","Sheldon - Mount Cotton","Thornlands",
"Victoria Point","Brisbane Port - Lytton",
"Manly - Lota","Manly West","Murarrie")
)
In the above dataset, the postcode 4161 appears twice as it covers both Wellington Point and Cleveland.
I have tried the following code:
postcode_unique_suburb_names <- pivot_wider(postcode_pop_sub, names_from = SA2.name, values_from = Postcode)
#> Error in pivot_wider(postcode_pop_sub, names_from = SA2.name, values_from = Postcode): could not find function "pivot_wider"
Created on 2022-11-15 with reprex v2.0.2
This seems to be on the right track:
data.frame(
check.names = FALSE,
`Alexandra Hills` = c(4161),
`Belmont - Gumdale` = c(4155:4153),
Birkdale = c(4159),
Capalaba = c(4157),
Thorneside = c(4158),
`Wellington Point` = c(4160),
Cleveland = c(c(4160, 4163)),
Ormiston = c(4160),
`Redland Bay` = c(4165),
`Sheldon - Mount Cotton` = c(c(4165, 4157)),
Thornlands = c(4164),
`Victoria Point` = c(4165),
`Brisbane Port - Lytton` = c(4178),
`Manly - Lota` = c(4179),
`Manly West` = c(4179),
Murarrie = c(c(4172, 4174)),
Tingalpa = c(4173),
Wakerley = c(4154),
Wynnum = c(4178),
`Wynnum West - Hemmant` = c(c(4178, 4174)),
`Bald Hills` = c(4036),
`Bridgeman Downs` = c(4035),
Carseldine = c(4034),
`Everton Park` = c(4053),
McDowall = c(4053),
Aspley = c(4034),
Chermside = c(4032),
`Chermside West` = c(4032),
Geebung = c(4034),
`Kedron - Gordon Park` = c(c(4031, 4055)),
Stafford = c(4053),
`Stafford Heights` = c(4053),
`Wavell Heights` = c(4012),
Boondall = c(4034),
`Brisbane Airport` = c(4008),
`Eagle Farm - Pinkenba` = c(c(4008, 4007, 4009)),
`Northgate - Virginia` = c(c(9464, 4013, 4014)),
`Nudgee - Banyo` = c(4014),
Nundah = c(4012),
`Bracken Ridge` = c(4017),
`Brighton (Qld)` = c(4017),
Deagon = c(4017),
`Sandgate - Shorncliffe` = c(4017),
`Taigum - Fitzgibbon` = c(4018),
Zillmere = c(4034),
`Camp Hill` = c(4152),
`Cannon Hill` = c(4170),
Carina = c(4152),
`Carina Heights` = c(4152),
Carindale = c(4152),
Annerley = c(c(4103, 4105)),
Coorparoo = c(4151),
`Fairfield - Dutton Park` = c(4103:4102),
Greenslopes = c(c(4103, 4120)),
`Holland Park` = c(4121),
`Holland Park West` = c(4121),
Woolloongabba = c(4102),
Yeronga = c(4105:4104),
`Eight Mile Plains` = c(4113),
`Macgregor (Qld)` = c(4109),
`Mansfield (Qld)` = c(4122),
`Mount Gravatt` = c(4122:4121),
`Rochedale - Burbank` = c(c(4156, 4123)),
`Upper Mount Gravatt` = c(4122),
Wishart = c(4122),
`Rocklea - Acacia Ridge` = c(c(4108, 4110, 4106)),
`Coopers Plains` = c(4108),
Moorooka = c(4105),
Robertson = c(4109),
`Salisbury - Nathan` = c(c(4111, 4860, 4107)),
Tarragindi = c(4121),
Algester = c(4115),
`Calamvale - Stretton` = c(4116:4117),
`Pallara - Willawong` = c(4110),
`Parkinson - Drewvale` = c(4116:4115),
Kuraby = c(4112),
Runcorn = c(4113),
Sunnybank = c(4109),
`Sunnybank Hills` = c(4109),
`Jindalee - Mount Ommaney` = c(4074),
`Middle Park - Jamboree Heights` = c(4074),
Riverhills = c(4074),
`Seventeen Mile Rocks - Sinnamon Park` = c(4073),
Westlake = c(4074),
`Bellbowrie - Moggill` = c(4070),
`Brookfield - Kenmore Hills` = c(4069),
`Chapel Hill` = c(4069),
`Fig Tree Pocket` = c(4069),
Kenmore = c(4069),
`Pinjarra Hills - Pullenvale` = c(4069:4070),
`Chelmer - Graceville` = c(c(4075, 4068)),
Corinda = c(4075),
Indooroopilly = c(4068),
Sherwood = c(4075),
`St Lucia` = c(c(4067, 4072)),
Taringa = c(4068),
Enoggera = c(4051),
`Enoggera Reservoir` = c(4520),
Keperra = c(4054),
Mitchelton = c(4053),
`Mount Coot-tha` = c(4066),
`The Gap` = c(4061),
`Upper Kedron - Ferny Grove` = c(4055),
`Brisbane City` = c(c(4001, 4000, 4002, 4003, 9000, 9001, 9002, 9005, 9007, 9009, 9010, 9013, 9015),
NA),
`Fortitude Valley` = c(4006),
`Highgate Hill` = c(4101),
`New Farm` = c(4005),
`South Brisbane` = c(4101),
`Spring Hill` = c(c(4000, 4004)),
`West End` = c(4101),
Balmoral = c(4171),
Bulimba = c(4171),
`East Brisbane` = c(4169),
Hawthorne = c(4171),
`Morningside - Seven Hills` = c(4170),
`Norman Park` = c(4170),
Albion = c(4010),
Alderley = c(4051),
Ascot = c(4007),
Clayfield = c(4011),
Grange = c(4051),
`Hamilton (Qld)` = c(4007),
Hendra = c(4011),
`Kelvin Grove - Herston` = c(c(4059, 4006, 4029)),
Newmarket = c(4051),
`Newstead - Bowen Hills` = c(4006:4005),
Wilston = c(4051),
Windsor = c(4030),
`Wooloowin - Lutwyche` = c(4030),
Ashgrove = c(4060),
Auchenflower = c(4066),
Bardon = c(4065:4066),
`Paddington - Milton` = c(4064),
`Red Hill (Qld)` = c(4059),
Toowong = c(4066),
Brinsmead = c(4870),
`Clifton Beach - Kewarra Beach` = c(c(4879, 4871)),
`Freshwater - Stratford` = c(4870),
Redlynch = c(c(4870, 4878)),
`Trinity Beach - Smithfield` = c(4879:4878),
`Yorkeys Knob - Machans Beach` = c(4878),
`Bentley Park` = c(4869),
`Cairns City` = c(c(4870, 4872)),
`Earlville - Bayview Heights` = c(c(4870, 4868)),
Edmonton = c(4869),
`Gordonvale - Trinity` = c(c(4865, 4871, 4869)),
`Kanimbla - Mooroobool` = c(4870),
`Lamb Range` = c(4870),
Manoora = c(4870),
Manunda = c(4870),
`Mount Sheridan` = c(4868),
`Westcourt - Bungalow` = c(4870),
`White Rock` = c(4868),
`Whitfield - Edge Hill` = c(4870),
Woree = c(4868),
Babinda = c(c(4860, 4871, 4861)),
Innisfail = c(4860),
Johnstone = c(c(4859, 4852, 4858, 4871, 4860, 4855, 4856, 4857),
NA),
Tully = c(c(4854, 4849, 4852, 4816)),
Wooroonooran = c(4860),
Yarrabah = c(4871),
Daintree = c(c(4873, 4871)),
`Port Douglas` = c(c(4877, 4871, 4873)),
Atherton = c(c(4883, 4872, 4882)),
Herberton = c(c(4888, 4872, 4887)),
Kuranda = c(c(4872, 4881, 4871)),
`Malanda - Yungaburra` = c(c(4860, 4886, 4885, 4884, 4872, 4883),
NA),
Mareeba = c(4880),
Balonne = c(c(4487, 4486, 4488, 4497)),
Chinchilla = c(4413:4412),
Goondiwindi = c(4390),
`Inglewood - Waggamba` = c(c(4357, 4387, 4496, 4497, 4390, 4498, 4494, 4388, 4352, 4384, 4385),
NA),
`Miles - Wandoan` = c(c(4418, 4424, 4425, 4415, 4419, 4427, 4416),
NA),
Roma = c(4455),
`Roma Surrounds` = c(c(4455, 4461, 4426, 4427, 4428, 4417, 4423, 4465, 4487, 4462, 4467, 4454),
NA),
Tara = c(c(4421, 4413, 4405, 4406, 4422, 4423),
NA),
`Crows Nest - Rosalie` = c(c(4614, 4355, 4352, 4306, 4314, 4353, 4402, 4354, 4401, 4403),
NA),
Jondaryan = c(c(4401, 4350, 4352, 4356, 4403, 4404, 4405, 4407),
NA),
Millmerran = c(c(4357, 4352, 4407)),
Pittsworth = c(c(4356, 4363, 4350, 4352, 4364)),
Wambo = c(c(4405, 4411, 4406, 4408, 4410)),
`Clifton - Greenmount` = c(c(4359, 4361, 4360, 4352, 4358)),
`Southern Downs - East` = c(c(4370, 4371, 4373, 4372)),
`Southern Downs - West` = c(c(4362, 4361, 4370, 4374, 4352, 4365),
NA),
Stanthorpe = c(4380),
`Stanthorpe Surrounds` = c(c(4352, 4380, 4381, 4375, 4382, 4383, 4376, 4377, 4378),
NA),
Warwick = c(4370),
`Central Highlands - East` = c(c(4712, 4702, 4718, 4713, 4717)),
`Central Highlands - West` = c(c(4702, 4722, 4721, 4723, 4720, 4709),
NA),
Emerald = c(4720),
`Agnes Water - Miriam Vale` = c(c(4678, 4677, 4674, 4676)),
Banana = c(c(4702, 4420, 4419, 4719, 4718, 4715, 4716),
NA),
Biloela = c(4715),
`Boyne Island - Tannum Sands` = c(4680),
When I attempt to join this data frame with another (that has total population per postcode) using this code:
joined_postcode_sub_pop <- inner_join(postcode_unique_suburb_names, postcode_pop, by = "Postcode")
#> Error in inner_join(postcode_unique_suburb_names, postcode_pop, by = "Postcode"): could not find function "inner_join"
Created on 2022-11-15 with reprex v2.0.2
I get the following error message:
Error in
inner_join()
:
! Join columns must be present in data.
Problem withPostcode
.
Runrlang::last_error()
to see where the error occurred.
Boyne Island - Tannum Sands
= c(4680),
Error: unexpected ',' in "Boyne Island - Tannum Sands
= c(4680),"
I've checked both data sets and Boyne Island and 4680 exist in both. The rlang::last_error() states:
<error/rlang_error>
Error in `inner_join()`:
! Join columns must be present in data.
✖ Problem with `Postcode`.
---
Backtrace:
1. dplyr::inner_join(...)
2. dplyr:::inner_join.data.frame(postcode_unique_suburb_names, postcode_pop, by = "Postcode")
Run `rlang::last_trace()` to see the full context.
The postcode_pop dataframe is:
check.names = FALSE,
Postcode = c(4000L,4001L,4002L,4003L,
4004L,4005L,4006L,4007L,4008L,4009L,4010L,4011L,
4012L,4013L,4014L,4017L,4018L,4019L,4020L,4021L),
`sum(Population)` = c(93931L,43756L,21878L,
10939L,6419L,25019L,82679L,31752L,1288L,1704L,11896L,
26789L,46214L,13738L,50730L,81535L,25238L,81964L,
21014L,17717L)
)
Thank you, in advance, for any help or advice you can provide.