Repeat function on dataframe with multiple factors--sapply?

Hello, I have a function that I would like to run on a dataframe containing multiple factors. I have flow data from three sites (A, B, C), and I simply want to run a function ("RBFlash") that returns an index of the flashiness of the flow at each site.

I could make the data wide and run the function separately on each site, but I think I should be able to do it with sapply. However, I keep getting an error.

Can someone provide help on this? Thank you.

Data:

datapasta::tribble_paste(data.Q2)
#> Error in is.data.frame(input_table): object 'data.Q2' not found
tibble::tribble(
  ~site, ~flow,
    "A",    6L,
    "A",    2L,
    "A",    4L,
    "A",    6L,
    "A",    7L,
    "A",    4L,
    "A",    6L,
    "A",    5L,
    "A",    8L,
    "B",   23L,
    "B",   26L,
    "B",   33L,
    "B",   34L,
    "B",   24L,
    "B",   50L,
    "B",   45L,
    "B",   29L,
    "B",   30L,
    "B",   42L,
    "C",   86L,
    "C",  114L,
    "C",  230L,
    "C",  173L,
    "C",  187L,
    "C",   88L,
    "C",  206L,
    "C",  168L,
    "C",  153L,
    "C",   86L
  )
#> # A tibble: 29 × 2
#>    site   flow
#>    <chr> <int>
#>  1 A         6
#>  2 A         2
#>  3 A         4
#>  4 A         6
#>  5 A         7
#>  6 A         4
#>  7 A         6
#>  8 A         5
#>  9 A         8
#> 10 B        23
#> # … with 19 more rows

Created on 2022-10-30 by the reprex package (v2.0.1)

R-script:

install.packages("remotes")

remotes::install_github("mccreigh/rwrfhydro")
#> Skipping install of 'rwrfhydro' from a github remote, the SHA1 (f6e03a41) has not changed since last install.
#>   Use `force = TRUE` to force installation

library("rwrfhydro")
#> To check rwrfhydro updates run: CheckForUpdates()

RBI <- RBFlash(data.Q2$flow, na.rm = TRUE)
#> Error in diff(m): object 'data.Q2' not found

RBI.all <- sapply(data.Q2$flow, RBFlash, na.rm = TRUE)
#> Error in lapply(X = X, FUN = FUN, ...): object 'data.Q2' not found

Created on 2022-10-30 by the reprex package (v2.0.1)

Hello,

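Both of your errors say that the object data.Q2 was not found. Are you sure you have actually created an object named data.Q2? If I assign your data to data.Q2 and run your code, everything works. Note, however, that sapply() over data.Q2$flow calls the function once for every single flow value rather than once per site. To apply the function per site, use tapply() in your case:

RBI.all <- tapply(data.Q2$flow, data.Q2$site, RBFlash, na.rm = TRUE)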
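To make the difference between the two calls concrete, here is a minimal sketch in base R. The function rb_flash is a hypothetical stand-in so the example runs without rwrfhydro; it assumes the usual Richards-Baker definition (sum of absolute changes between consecutive values divided by the total flow), and the real RBFlash may differ in its details.

```r
# Hypothetical stand-in for rwrfhydro::RBFlash so the example runs without the package.
# Assumption: flashiness = sum of absolute step-to-step changes / total flow.
rb_flash <- function(q, na.rm = TRUE) {
  sum(abs(diff(q)), na.rm = na.rm) / sum(q, na.rm = na.rm)
}

# The data from the question, this time actually assigned to data.Q2.
data.Q2 <- data.frame(
  site = rep(c("A", "B", "C"), times = c(9, 10, 10)),
  flow = c(6, 2, 4, 6, 7, 4, 6, 5, 8,
           23, 26, 33, 34, 24, 50, 45, 29, 30, 42,
           86, 114, 230, 173, 187, 88, 206, 168, 153, 86)
)

# sapply() walks over the flow values one at a time, so the function only
# ever sees a single number and the site column is never used.
per_value <- sapply(data.Q2$flow, rb_flash)
length(per_value)  # one result per row, not one per site

# tapply() first splits flow by site, then calls the function once per group.
per_site <- tapply(data.Q2$flow, data.Q2$site, rb_flash)
per_site
```

sapply() can give the same per-site result if the vector is split first, for example sapply(split(data.Q2$flow, data.Q2$site), rb_flash).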

Kind regards


@FactOREO Thank you so much! It worked perfectly. I need to dig in to figure out when to use tapply versus sapply!

@FactOREO I am trying to run a similar script, with a different dataset and function, and keep getting an error thrown at me. Do you have any ideas? Thank you.

A little bit of the data:

tibble::tribble(
                              ~date.well.combined2.parm.code.....adj,
                     "1 2008-10-09           MW01       nox  0.0075",
                     "2 2008-10-09           MW07       nox  1.7000",
                     "3 2008-10-10           MW11       nox  4.6000",
                     "4 2008-10-10           MW22       nox  0.1900",
                     "5 2008-10-10           SW01       nox  1.4000",
                     "6 2008-10-21           MW04       nox 12.0000"
                                                     
                     )
#> # A tibble: 6 × 1
#>   date.well.combined2.parm.code.....adj        
#>   <chr>                                        
#> 1 1 2008-10-09           MW01       nox  0.0075
#> 2 2 2008-10-09           MW07       nox  1.7000
#> 3 3 2008-10-10           MW11       nox  4.6000
#> 4 4 2008-10-10           MW22       nox  0.1900
#> 5 5 2008-10-10           SW01       nox  1.4000
#> 6 6 2008-10-21           MW04       nox 12.0000

Created on 2022-11-07 by the reprex package (v2.0.1)

I tried running the script two ways: one using "group_by" and the other using your "tapply" method:

outliers.grubb <- high %>%
  group_by(well.combined2) %>%
    sapply(adj, grubbs.test, na.rm = T
                        )
#> Error in high %>% group_by(well.combined2) %>% sapply(adj, grubbs.test, : could not find function "%>%"

outliers.grubb <- tapply(high$well.combined2, high$adj, grubbs.test, na.rm =T)
#> Error in tapply(high$well.combined2, high$adj, grubbs.test, na.rm = T): object 'grubbs.test' not found

Created on 2022-11-07 by the reprex package (v2.0.1)
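A few things in the two attempts above look like likely causes, offered as a sketch rather than a confirmed fix. The first error says "%>%" could not be found, which usually means dplyr (or magrittr) was not loaded inside the reprex. The second says grubbs.test was not found; assuming it is the function from the outliers package, that package would need to be loaded as well, and as far as I know that function has no na.rm argument. Finally, tapply() expects the values first and the grouping variable second, so high$adj and high$well.combined2 appear to be swapped. The code below rebuilds the six rows shown above as a data frame (the column split is an assumption based on the pasted column name) and uses a hypothetical wrapper, safe_grubbs, that drops missing values and skips groups that are too small to test.

```r
# Assumed layout of `high`: the six pasted rows split into one column per field.
high <- data.frame(
  date           = as.Date(c("2008-10-09", "2008-10-09", "2008-10-10",
                             "2008-10-10", "2008-10-10", "2008-10-21")),
  well.combined2 = c("MW01", "MW07", "MW11", "MW22", "SW01", "MW04"),
  parm.code      = "nox",
  adj            = c(0.0075, 1.7, 4.6, 0.19, 1.4, 12)
)

# tapply() takes the values first and the grouping variable second.
n_by_well   <- tapply(high$adj, high$well.combined2, length)
max_by_well <- tapply(high$adj, high$well.combined2, max, na.rm = TRUE)

# Hypothetical wrapper: removes NAs itself (instead of passing na.rm) and
# returns NULL when a group has fewer than 3 values or when the outliers
# package is not installed.
safe_grubbs <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) < 3 || !requireNamespace("outliers", quietly = TRUE)) {
    return(NULL)
  }
  outliers::grubbs.test(x)
}

# split() plus lapply() keeps each test result (or NULL) in a named list.
grubbs_by_well <- lapply(split(high$adj, high$well.combined2), safe_grubbs)
```

With only the six rows shown, every well has a single value, so each entry of grubbs_by_well is NULL; on the full dataset the wells with enough observations would each get a test result.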
