Issues using summarize()

bryanrt · March 13, 2021, 6:48pm

Goal

I'm not sure why I haven't been able to get this to work. I have a data frame similar to this simplified example:

  type sec station   wt
1    0  43     DAG 0.80
2    1  92     DAG 0.24
3   31  31     DAG   NA
4    0  22    CBET 0.92
5    1  52    CBET 0.10
6   31  61    CBET   NA

and I am trying to get, essentially, to here

  station  sec_0 wt_0 sec_1 wt_1 sec_31 wt_31
1     DAG     43 0.80    92 0.24     31    NA
2    CBET     22 0.92    52 0.10     61    NA

I know it's probably super simple but when I've tried putting conditional statements inside the summarize it doesn't want to work.

REPREX

reprex_data <- read_csv(file = "https://gitlab.com/Bryanrt-geophys/sac2eqtransformr/-/raw/master/sample_data/pre_stead.csv", col_names = T)

reprex_data %>%
  group_by(station) %>%
  summarize(
    sec_0 = if(type == 0){
      sec
    } else (NA),
    wt_0 = if(type == 0){
      wt
    } else (NA),
    sec_1 = if(type == 1){
      sec
    } else (NA),
    wt_1 = if(type == 1){
      wt
    } else (NA),
    sec_31 = if(type == 31){
      sec
    } else (NA),
    wt_31 = if(type == 31){
      wt
    } else (NA)
  )

This is just a basic example to avoid long convoluted instructions. I have provided my real data below. Any and all help is appreciated. I am trying to split the columns sec, wt, ain, obs_arv, obs_trv, resid, and sta_cor.

Actual Data

actual_data <- read_csv(file = "https://gitlab.com/Bryanrt-geophys/sac2eqtransformr/-/raw/master/sample_data/pre_stead.csv", col_names = T)

FJCC · March 13, 2021, 7:18pm

I would use functions from tidyr rather than summarize()

library(readr)
library(tidyr)
reprex_data <- read_csv(file = "https://gitlab.com/Bryanrt-geophys/sac2eqtransformr/-/raw/master/sample_data/pre_stead.csv", col_names = T)
#> Parsed with column specification:
#> cols(
#>   type = col_double(),
#>   sec = col_double(),
#>   station = col_character(),
#>   wt = col_double()
#> )
tmp <- pivot_longer(reprex_data, cols = c("sec", "wt"))
pivot_wider(tmp, names_from = c("name", "type"), values_from = value)
#> # A tibble: 2 x 7
#>   station sec_0  wt_0 sec_1  wt_1 sec_31 wt_31
#>   <chr>   <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>
#> 1 DAG        43  0.8     92  0.24     31    NA
#> 2 CBET       22  0.92    52  0.1      61    NA

Created on 2021-03-13 by the reprex package (v0.3.0)
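
As a side note on why the original summarize() attempt does not work: `if` expects a single TRUE or FALSE, but inside a grouped summarize() the condition `type == 0` is a vector with one element per row of the group. The following is a minimal sketch of a summarize() version that avoids `if` by subsetting with the logical vector instead. It assumes every station has exactly one row per type, and the toy tibble is typed in by hand from the example in the question rather than read from the linked CSV.

```r
library(dplyr)

# Toy data typed in from the question (an assumption: not read from the CSV)
toy <- tibble(
  type    = c(0, 1, 31, 0, 1, 31),
  sec     = c(43, 92, 31, 22, 52, 61),
  station = c("DAG", "DAG", "DAG", "CBET", "CBET", "CBET"),
  wt      = c(0.80, 0.24, NA, 0.92, 0.10, NA)
)

# Subset with a logical vector instead of if(); this relies on there being
# exactly one row per station/type pair, so each expression yields one value
by_station <- toy %>%
  group_by(station) %>%
  summarize(
    sec_0  = sec[type == 0],  wt_0  = wt[type == 0],
    sec_1  = sec[type == 1],  wt_1  = wt[type == 1],
    sec_31 = sec[type == 31], wt_31 = wt[type == 31]
  )

by_station
```

Note that summarize() returns the groups sorted by station, so CBET comes before DAG here. The tidyr approach scales better once there are more than a handful of types or columns, since nothing has to be spelled out by hand.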

bryanrt · March 13, 2021, 8:18pm

Thank you for the directions! I have tried applying this to my actual data, but to no avail. I am not sure if it's because of the NAs in some of my columns. Do you mind taking a look and letting me know what you think? I am trying to split the columns sec, wt, ain, obs_arv, obs_trv, resid, and sta_cor.

Actual Data

actual_data <- read_csv(file = "https://gitlab.com/Bryanrt-geophys/sac2eqtransformr/-/raw/master/sample_data/pre_stead.csv", col_names = T)

bryanrt · March 13, 2021, 8:56pm

Okay, I got this to work, though I am having to manually remove certain columns that I did not want duplicated. All in all, it got me to where I needed to be to get the job done. Thank you kindly, @FJCC.

test <- pivot_longer(pre_stead, cols = c("uncertanty_multiply_quarter_sec",
                                         "obs_p_arrival",
                                         "sec",
                                         "fm_qual",
                                         "wt",
                                         "delta",
                                         "azm",
                                         "ain",
                                         "obs_arv",
                                         "obs_trv",
                                         "theo_trv",
                                         "resid",
                                         "sta_cor",
                                         "DIST",
                                         "AZM",
                                         "AIN",
                                         "HRMN",
                                         "P_RES",
                                         "DATE",
                                         "Time",
                                         "ORIGIN",
                                         "LAT_deg",
                                         "LAT_min",
                                         "LONG_deg",
                                         "LONG_min",
                                         "DEPTH",
                                         "MAG",
                                         "NO_DM",
                                         "GAP_M",
                                         "RMS",
                                         "ERH",
                                         "ERZ",
                                         "reciever_latitude",
                                         "reciever_longitude",
                                         "reciever_elevation_m"))
test2 <- pivot_wider(test, names_from = c("name", "type"), values_from = value)
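
One possible way to avoid removing the duplicated columns by hand is to skip pivot_longer() and pass only the columns that should be split to pivot_wider() through `values_from`, with `id_cols` naming the column(s) that identify a row. The sketch below runs on the toy data from the first post, not on the real file. For the actual data, `values_from` would presumably be c(sec, wt, ain, obs_arv, obs_trv, resid, sta_cor), and the right `id_cols` is an assumption that depends on what uniquely identifies a record there.

```r
library(tidyr)
library(tibble)

# Toy data typed in from the first post (an assumption: not the real file)
toy <- tibble(
  type    = c(0, 1, 31, 0, 1, 31),
  sec     = c(43, 92, 31, 22, 52, 61),
  station = c("DAG", "DAG", "DAG", "CBET", "CBET", "CBET"),
  wt      = c(0.80, 0.24, NA, 0.92, 0.10, NA)
)

# Only sec and wt are spread across types; columns not listed in id_cols,
# names_from, or values_from are dropped rather than duplicated
wide <- pivot_wider(
  toy,
  id_cols     = station,
  names_from  = type,
  values_from = c(sec, wt)
)

wide
```

With several `values_from` columns, the new names are built as value column, separator, type (sec_0, wt_31, and so on), grouped by value column rather than interleaved. This route may also help when the pivoted columns have different types, since pivot_longer() has to combine everything it gathers into a single value column.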
