Aggregate columns for specific groups

Hi,

Using the iris dataset as an example (a small subset below), I need to aggregate (sum) the Petal.Width for each species:

Petal.Width Species
0.2 setosa
0.2 setosa
0.2 setosa
1.4 versicolor
1.5 versicolor
1.5 versicolor

To result in:
Species Sum
setosa 0.6
versicolor 4.4

I've tried various ways of doing this without success. Does anyone have any ideas?

Many thanks for your help

Here is a solution using the dplyr package.

library(dplyr)

SpeciesSums <- iris %>% group_by(Species) %>% summarize(SUMS = sum(Petal.Width))
SpeciesSums
#> # A tibble: 3 × 2
#>   Species     SUMS
#>   <fct>      <dbl>
#> 1 setosa      12.3
#> 2 versicolor  66.3
#> 3 virginica  101.

Created on 2022-04-26 by the reprex package (v0.2.1)
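
To check this against the numbers in the question, here is a minimal sketch that types in the six rows by hand and sums them per species. The names `small`, `small_sums` and `base_sums` are made up for illustration, and the base R `aggregate()` call is just an alternative I'm adding in case dplyr isn't available, not part of the answer above.

```r
library(dplyr)

# The six rows from the question, entered by hand (hypothetical object name)
small <- data.frame(
  Petal.Width = c(0.2, 0.2, 0.2, 1.4, 1.5, 1.5),
  Species = c("setosa", "setosa", "setosa",
              "versicolor", "versicolor", "versicolor")
)

# dplyr: one row per species, holding the sum of Petal.Width
small_sums <- small %>%
  group_by(Species) %>%
  summarize(Petal.Width = sum(Petal.Width))

# Base R alternative: formula interface of aggregate(), no extra packages
base_sums <- aggregate(Petal.Width ~ Species, data = small, FUN = sum)

print(small_sums)
print(base_sums)
```

Both objects should have one row for setosa and one for versicolor, matching the table asked for in the question.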

Brilliant, that's great! I have a wide data frame with many variables. Do you know how I can specify a range of variables to sum()?

Many thanks again

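library(dplyr)

# Select by variable property: sum every numeric column
(SpeciesSums <- iris %>%
  group_by(Species) %>%
  summarize(across(where(is.numeric), ~ sum(.x))))

# Select by explicit names
# any_of() silently skips names that are not columns; all_of() would raise an error instead
vars_to_do <- c("Sepal.Length", "Sepal.Width")

(SpeciesSums <- iris %>%
  group_by(Species) %>%
  summarize(across(any_of(vars_to_do), ~ sum(.x))))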
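
Since the question asked about a range of variables, here is a sketch of two more selection styles that may help with a wide data frame. It assumes the columns you want sit next to each other (for the `:` form) or share a name prefix (for `starts_with()`); `range_sums` and `sepal_sums` are names I made up for the example.

```r
library(dplyr)

# Select a contiguous range of columns, by position in the data frame:
# everything from Sepal.Length through Petal.Length
range_sums <- iris %>%
  group_by(Species) %>%
  summarize(across(Sepal.Length:Petal.Length, sum))

# Select by name pattern: every column whose name starts with "Sepal"
sepal_sums <- iris %>%
  group_by(Species) %>%
  summarize(across(starts_with("Sepal"), sum))

print(range_sums)
print(sepal_sums)
```

The `:` form depends on column order, so if the data frame gets reordered it will pick up different columns. A name pattern or an explicit vector of names is safer in that case.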

That's brilliant, thank you so much! I've been battling with this all day and now it's fixed.

Much appreciated
