Hi, I was wondering if anyone could help me extract certain words from the "des" column and put them into a new column. I only want the part that gives the location of the hit. For example, for the first row I would want just "right center field". Another example is "left field".
homeruns <- data.frame(
des = as.factor(c("Trevor Story homers (37) on a fly ball to right center field. ", "Nolan Arenado homers (38) on a fly ball to left field. ",
"Max Muncy homers (35) on a fly ball to left center field. Joc Pederson scores. ",
"Cody Bellinger homers (25) on a fly ball to right center field. Max Muncy scores. ",
"Anthony Rizzo homers (25) on a fly ball to right field. ",
"Travis Shaw homers (32) on a line drive to right field. ",
"Taylor Ward homers (6) on a fly ball to left center field. Jefry Marte scores. ",
"Austin Barnes homers (4) on a fly ball to left center field. ",
"Trevor Story homers (36) on a fly ball to center field. ",
"Nolan Arenado homers (37) on a line drive to left field. ",
"Max Kepler homers (20) on a fly ball to right field. Mitch Garver scores. ",
"Franklin Barreto homers (5) on a fly ball to center field. Matt Olson scores. ",
"Willson Contreras homers (10) on a line drive to left field. Kris Bryant scores. ")),
game_date = as.factor(c("10/1/2018", "10/1/2018", "10/1/2018", "10/1/2018",
"10/1/2018", "9/30/2018", "9/30/2018",
"9/30/2018", "9/30/2018", "9/30/2018",
"9/30/2018", "9/30/2018", "9/30/2018"))
)
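library(tidyverse)

homeruns %>%
  # des is stored as a factor, so convert it to character before matching
  mutate(des = as.character(des),
         # take the text after " to " up to (not including) the next period
         location = str_extract(des, "(?<=\\sto\\s)[^\\.]+(?=\\.)")) %>%
  select(location)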
#> location
#> 1 right center field
#> 2 left field
#> 3 left center field
#> 4 right center field
#> 5 right field
#> 6 right field
#> 7 left center field
#> 8 left center field
#> 9 center field
#> 10 left field
#> 11 right field
#> 12 center field
#> 13 left field
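Since the question asked for a new column rather than a standalone result, here is a minimal sketch of keeping `location` next to the original data and then tallying it. It assumes the same sentence pattern as above; `hr_small`, `hr_located`, and `location_counts` are made-up names, and `hr_small` is just a three-row stand-in for `homeruns`.

```r
library(dplyr)
library(stringr)

# Three-row stand-in for the thread's `homeruns` data (hypothetical name)
hr_small <- data.frame(
  des = c(
    "Trevor Story homers (37) on a fly ball to right center field. ",
    "Nolan Arenado homers (38) on a fly ball to left field. ",
    "Max Muncy homers (35) on a fly ball to left center field. Joc Pederson scores. "
  ),
  stringsAsFactors = FALSE
)

# Dropping select() keeps every original column and adds `location` beside it
hr_located <- hr_small %>%
  mutate(location = str_extract(des, "(?<=\\sto\\s)[^.]+(?=\\.)"))

# One row per location with the number of home runs hit there
location_counts <- count(hr_located, location)
```

Assigning the result back (for example `homeruns <- homeruns %>% mutate(...)`) would make the new column permanent.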
If you do not mind, could you help me with another issue too? I have the average dimensions of an MLB stadium, and I was hoping to see how many home runs a player would have hit at this hypothetical park. I am assuming I would use an if statement, but I am not that experienced in that area. Thanks!
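One possible way to approach this, sketched under some loud assumptions: the data posted in this thread has no hit distances, so the `hit_distance` column, the `hits` and `fences` tables, and every number below are hypothetical placeholders, not real measurements. Instead of an `if` statement, which handles only one value at a time, the sketch joins each hit to the fence distance for its location and uses a vectorized comparison.

```r
library(dplyr)

# HYPOTHETICAL batted-ball data: `hit_distance` (feet) is NOT in the thread's data
hits <- data.frame(
  batter = c("Trevor Story", "Nolan Arenado", "Max Muncy", "Trevor Story"),
  location = c("right center field", "left field",
               "left center field", "center field"),
  hit_distance = c(410, 325, 380, 415),
  stringsAsFactors = FALSE
)

# HYPOTHETICAL fence distances (feet) for the "average" park; placeholders only
fences <- data.frame(
  location = c("left field", "left center field", "center field",
               "right center field", "right field"),
  fence_distance = c(330, 375, 400, 375, 330),
  stringsAsFactors = FALSE
)

park_hr <- hits %>%
  # attach the fence distance that matches each hit's location
  left_join(fences, by = "location") %>%
  # vectorized test: TRUE when the ball travels at least as far as the fence
  mutate(clears_fence = hit_distance >= fence_distance) %>%
  group_by(batter) %>%
  # summing a logical column counts the TRUE values
  summarise(hr_at_avg_park = sum(clears_fence), .groups = "drop")
```

Note that a table containing only actual home runs can show which ones would fall short at the average park, but counting every would-be home run would also need the batted balls that stayed in the real parks. Fence height is ignored here as well.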