Aggregating data with a facet

I am looking for help to work out how to re-arrange my dataset.

I have a dataset the looks a bit like this called data:

No ID age sex location time Bacteria abundance
1 BUGA 6 F A September_2017 Bacteria.A 0.43
2 BUGB 6 M B September_2017 Bacteria.A 0.50
3 BUGC 6 F A September_2017 Bacteria.A 0.002
4 BUGD 6 F B September_2017 Bacteria.A 0.034
5 BUGE 6 F A September_2017 Bacteria.A 0.043
6 BUGF 6 M A September_2017 Bacteria.A 0.023
7 BUGA 6 F A March_2018 Bacteria.A 0.34
8 BUGB 6 M B March_2018 Bacteria.A 0.23
9 BUGA 6 F A September_2017 Bacteria.A 0.43
10 BUGB 6 M B September_2017 Bacteria.B 0.50
11 BUGC 6 F A September_2017 Bacteria.B 0.0000089
12 BUGD 6 F B September_2017 Bacteria.B 0.034
13 BUGE 6 F A September_2017 Bacteria.B 0.000079
14 BUGF 6 M A September_2017 Bacteria.B 0.00098
15 BUGA 6 F A March_2018 Bacteria.B 0.0034
16 BUGB 6 M B March_2018 Bacteria.B 0.00012
data2 <- aggregate(data$abundance, list(data$location ,data$Bacteria),mean)
head(data2)

And this gives me an output that would look a bit like:
data2

Group.1 Group.2 value
1 A Bacteria.A mean value
2 B Bacteria.A mean value
3 A Bacteria.B mean value
4 B Bacteria.B mean value

*where mean value would be a number = to the mean of all the values combined.

However, I would like this to be faceted by time-point also like this:

Group.1 Group.2 value time
1 A Bacteria.A mean value September_2017
2 B Bacteria.A mean value September_2017
3 A Bacteria.B mean value September_2017
4 B Bacteria.B mean value September_2017
5 A Bacteria.A mean value March_2018
6 B Bacteria.A mean value March_2018
7 A Bacteria.B mean value March_2018
8 B Bacteria.B mean value March_2018

Is this possible?

I've tried various pieces of code without success so any help would be much appreciated.
Thanks!
.

I think this is what you are trying to do or at least close enough to work as a starting point

library(dplyr)

# Sample data on a copy/paste friendly format (replace with your actual dataset).
sample_data <- data.frame(
  stringsAsFactors = FALSE,
                No = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16),
                ID = c("BUGA","BUGB","BUGC","BUGD",
                       "BUGE","BUGF","BUGA","BUGB","BUGA","BUGB","BUGC",
                       "BUGD","BUGE","BUGF","BUGA","BUGB"),
               age = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6),
               sex = c("F","M","F","F","F","M",
                       "F","M","F","M","F","F","F","M","F","M"),
          location = c("A","B","A","B","A","A",
                       "A","B","A","B","A","B","A","A","A","B"),
              time = c("September_2017",
                       "September_2017","September_2017","September_2017","September_2017",
                       "September_2017","March_2018","March_2018",
                       "September_2017","September_2017","September_2017",
                       "September_2017","September_2017","September_2017","March_2018",
                       "March_2018"),
          Bacteria = c("Bacteria.A","Bacteria.A",
                       "Bacteria.A","Bacteria.A","Bacteria.A","Bacteria.A",
                       "Bacteria.A","Bacteria.A","Bacteria.A","Bacteria.B",
                       "Bacteria.B","Bacteria.B","Bacteria.B","Bacteria.B",
                       "Bacteria.B","Bacteria.B"),
         abundance = c(0.43,0.5,0.002,0.034,0.043,
                       0.023,0.34,0.23,0.43,0.5,8.9e-06,0.034,7.9e-05,
                       0.00098,0.0034,0.00012)
)

sample_data %>% 
    group_by(location, Bacteria, time) %>% 
    summarise(mean_abundance = mean(abundance)) %>% 
    arrange(desc(time), Bacteria, location)
#> `summarise()` regrouping output by 'location', 'Bacteria' (override with `.groups` argument)
#> # A tibble: 8 x 4
#> # Groups:   location, Bacteria [4]
#>   location Bacteria   time           mean_abundance
#>   <chr>    <chr>      <chr>                   <dbl>
#> 1 A        Bacteria.A September_2017       0.186   
#> 2 B        Bacteria.A September_2017       0.267   
#> 3 A        Bacteria.B September_2017       0.000356
#> 4 B        Bacteria.B September_2017       0.267   
#> 5 A        Bacteria.A March_2018           0.34    
#> 6 B        Bacteria.A March_2018           0.23    
#> 7 A        Bacteria.B March_2018           0.0034  
#> 8 B        Bacteria.B March_2018           0.00012

Created on 2020-10-03 by the reprex package (v0.3.0)

For future reference, please make your questions providing a proper REPRoducible EXample (reprex) illustrating your issue (as the one above).

And if you want to learn more about data wrangling you can read this free ebook

1 Like

This also works, other than the column names needing to be adjusted.

data <- read.csv("~/R/Play/Dummy.csv")
data2 <- aggregate(data$abundance, list(data$location ,data$Bacteria, data$time),mean)
data2
#>   Group.1    Group.2        Group.3            x
#> 1       A Bacteria.A     March_2018 0.3400000000
#> 2       B Bacteria.A     March_2018 0.2300000000
#> 3       A Bacteria.B     March_2018 0.0034000000
#> 4       B Bacteria.B     March_2018 0.0001200000
#> 5       A Bacteria.A September_2017 0.1856000000
#> 6       B Bacteria.A September_2017 0.2670000000
#> 7       A Bacteria.B September_2017 0.0003559667
#> 8       B Bacteria.B September_2017 0.2670000000

Created on 2020-10-03 by the reprex package (v0.3.0)

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.