tricky data re-ordering

Erica · April 20, 2020, 10:49pm

Hi,

I have some data I want to re-order but a few problems... here is how it currently looks


structure(list(county_fips_code = c(45001, 45001, 45001, 45001, 
45001, 45001, 45001, 45001, 45001, 45001), County = c("Abbeville County, SC", 
"Abbeville County, SC", "Abbeville County, SC", "Abbeville County, SC", 
"Abbeville County, SC", "Abbeville County, SC", "Abbeville County, SC", 
"Abbeville County, SC", "Abbeville County, SC", "Abbeville County, SC"
), Cluster = c("Aerospace Vehicles and Defense", "Agricultural Inputs and Services", 
"Apparel", "Automotive", "Biopharmaceuticals", "Business Services", 
"Coal Mining", "Communications Equipment and Services", "Construction Products and Services", 
"Distribution and Electronic Commerce"), strength = c(0, 0, 0, 
1, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 10L), class = "data.frame")

I want to eventually get it to the form of one observation per county with 50 columns per observation so it would look like

  County  County_fips Aerospace Apparel Agriculture
       a      123          1     0       1
       b      234          0     0       0

The tricky part is that some counties are missing the numeric id (fips code) and I've had problems in the past with R being able to distinguish text. Any tips?

joels · April 20, 2020, 11:16pm

Do you have counties where the FIPS code for that county is missing in every row for that county? Or do you have cases where FIPS code is missing for some rows for a given county, but not others? In that case, you can fill the missing values with the previous non-missing value. In either of the cases above, the code below should work (after filling NA values, if needed). Can you provide a data sample that has the missingness problems that affect your actual use case?

library(tidyverse)

df %>% 
  pivot_wider(names_from=Cluster, values_from=strength)

Erica · April 20, 2020, 11:57pm

awesome, this is great!

system · April 27, 2020, 11:57pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.