How do I sum the values for rows with specific string matches

dantheman · July 25, 2021, 11:53am

Hi All

I have a very large Excel table with data in two columns as follows. My goal is to reduce the number of entries by grouping the results. For example, I want to sum the kg of CO2 produced in the use of "Heat". I am only interested in the values where "Heat" is the first word in the string. I expect to have 1000+300=1300 for Heat. I am unsure how to go about this. Below are my steps so far.

Afterwards, I plan to create a bar chart showing how much CO2 each process grouping creates.

I have to do this for about 30 Excel Tables and produce 30 bar charts. Maybe there is a way to reduce the workload?

Process & Mass of CO2
Heat & 1000
Electricity_and_heat & 2000
Electricity & 1000
Heat_China & 300
China_heat & 1000

##Select rows containing "Heat" - but this does not mean that it is at the start of the string
library(readxl)
Book1 <- read_excel("Book1.xlsx")
View(Book1)
data2 <- Book1[str_detect(Book1$Proces, "Heat"), ]
data2

Please help. Thanks in advance!

andresrcs · July 25, 2021, 1:32pm

Just specify that you want the start of the string with this metacharacter "^"

data2 <- Book1[str_detect(Book1$Proces, "^Heat"), ]

dantheman · July 25, 2021, 2:42pm

Thanks.

For the other part of my question: how do I sum the masses by group? i.e. how to I get to 1300kg for "Heat" or for "Electricity". I have approximately 12 groups.

andresrcs · July 25, 2021, 3:38pm

You can use dplyr functions, group_by() and summarise().

You can learn more about data wrangling in general in this free ebook

https://r4ds.had.co.nz/transform.html

Or, if you need more specific help, please provide a proper REPRoducible EXample (reprex) illustrating your issue.

Also, please be aware that this is not supposed to be a support ticket where we follow along with all the stages of your project, this is mostly a Q & A forum so you should only ask one reasonably well defined question per topic, but you can open as many topics as you need.

system · August 15, 2021, 3:39pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.