Newbie here with limited experience in R, so I would appreciate anyone's thoughts on this. I am using the Lahman package in R to look at baseball stats. For this question I am specifically looking at the Pitching table.
I'm a bit tripped up: I know what I want to do but am not sure how to do it. I think I need to use mutate() to create a new column that totals GS (games started) per pitcher per team for the 1871-2021 seasons. The code I have so far is below.
```
library(tidyverse)
library(dplyr)   # already attached by tidyverse; kept for clarity
library(tidyr)   # already attached by tidyverse; kept for clarity
library(Lahman)  # provides LahmanData and the Pitching table
View(LahmanData)
View(Pitching)
# Pitchers with at least one GS for the 2011 Milwaukee Brewers (6 total)
MIL <- filter(Pitching, yearID == 2011, teamID == "MIL", GS > 0)
# Every pitcher row with at least one GS from 1871-2021 (2022 excluded)
GSPitching <- filter(Pitching, yearID < 2022, GS > 0)
as_tibble(GSPitching)
```
As you can see from the 2011 Milwaukee Brewers example, I want to count the number of pitchers who had a GS per team, per year. The 2011 Brewers had 6, as seen in the MIL data frame.
The GSPitching data frame has all pitchers with GS > 0, i.e. those who appeared in at least one game as a starting pitcher for their team.
What is the best way to sum this up per team and per year? I think it's some type of mutate that counts the pitchers with a GS, but how do you get there per team in each distinct year?
I want the end product in a data frame so that, for example, I can see how many pitchers had a GS for any team in a given year (e.g. the 1973 Tigers, 2003 Astros) before making some visualizations from it.
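To make the goal concrete, here is a rough sketch of the shape of output I am after. It uses a small made-up data frame (`toy`, with hypothetical player IDs) that only borrows the column names from Pitching. It also assumes that group_by() plus summarise() is the right tool rather than mutate(), which may not be the best approach.

```
library(dplyr)

# Hypothetical toy rows; only the column names match Lahman's Pitching table
toy <- data.frame(
  playerID = c("a01", "b01", "c01", "a01", "d01", "e01"),
  yearID   = c(2011, 2011, 2011, 2011, 2011, 2012),
  teamID   = c("MIL", "MIL", "MIL", "MIL", "DET", "MIL"),
  GS       = c(30, 5, 0, 2, 10, 1)
)

# Keep rows with at least one start, then count distinct pitchers
# within each year/team combination (one output row per team-season)
starters_per_team <- toy %>%
  filter(GS > 0) %>%
  group_by(yearID, teamID) %>%
  summarise(n_starters = n_distinct(playerID), .groups = "drop")

starters_per_team
```

On the real data I would presumably swap `toy` for Pitching filtered to yearID < 2022. I used n_distinct(playerID) instead of n() because the Pitching table has a stint column, so my understanding is that a pitcher could show up in more than one row for the same team and year.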
Appreciate any guidance here.