I have a very large Excel table with data in two columns as follows. My goal is to reduce the number of entries by grouping the results. For example, I want to sum the kg of CO2 produced in the use of "Heat". I am only interested in the values where "Heat" is the first word in the string. I expect to have 1000+300=1300 for Heat. I am unsure how to go about this. Below are my steps so far.
Afterwards, I plan to create a bar chart showing how much CO2 each process grouping creates.
I have to do this for about 30 Excel Tables and produce 30 bar charts. Maybe there is a way to reduce the workload?
Process & Mass of CO2
Heat & 1000
Electricity_and_heat & 2000
Electricity & 1000
Heat_China & 300
China_heat & 1000
##Select rows containing "Heat" - but this does not mean that it is at the start of the string
library(readxl)
Book1 <- read_excel("Book1.xlsx")
View(Book1)
data2 <- Book1[str_detect(Book1$Proces, "Heat"), ]
data2
For the other part of my question: how do I sum the masses by group? i.e. how to I get to 1300kg for "Heat" or for "Electricity". I have approximately 12 groups.
Also, please be aware that this is not supposed to be a support ticket where we follow along with all the stages of your project, this is mostly a Q & A forum so you should only ask one reasonably well defined question per topic, but you can open as many topics as you need.