tricky data re-ordering


I have some data I want to re-order but a few problems... here is how it currently looks

structure(list(county_fips_code = c(45001, 45001, 45001, 45001, 
45001, 45001, 45001, 45001, 45001, 45001), County = c("Abbeville County, SC", 
"Abbeville County, SC", "Abbeville County, SC", "Abbeville County, SC", 
"Abbeville County, SC", "Abbeville County, SC", "Abbeville County, SC", 
"Abbeville County, SC", "Abbeville County, SC", "Abbeville County, SC"
), Cluster = c("Aerospace Vehicles and Defense", "Agricultural Inputs and Services", 
"Apparel", "Automotive", "Biopharmaceuticals", "Business Services", 
"Coal Mining", "Communications Equipment and Services", "Construction Products and Services", 
"Distribution and Electronic Commerce"), strength = c(0, 0, 0, 
1, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 10L), class = "data.frame")

I want to eventually get it to the form of one observation per county with 50 columns per observation so it would look like

  County  County_fips Aerospace Apparel Agriculture
       a      123          1     0       1
       b      234          0     0       0

The tricky part is that some counties are missing the numeric id (fips code) and I've had problems in the past with R being able to distinguish text. Any tips?

Do you have counties where the FIPS code for that county is missing in every row for that county? Or do you have cases where FIPS code is missing for some rows for a given county, but not others? In that case, you can fill the missing values with the previous non-missing value. In either of the cases above, the code below should work (after filling NA values, if needed). Can you provide a data sample that has the missingness problems that affect your actual use case?


df %>% 
  pivot_wider(names_from=Cluster, values_from=strength)

awesome, this is great!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.