Hi, and welcome. This question straddles the line between those that need a reproducible example (called a reprex) to get a useful answer and those where some general advice will do.
The answer to your question does not require all the variables; you need only TEAM, WIN, ERRORS and DATE. You have to construct the date from the DAY, MONTH and YEAR variables to get a date object. See the zoo and lubridate packages.
If the data live in a database, write your query to select just those pieces and cobble together DATE. If you are reading from a csv file, just bring it all into a data frame or tibble. We'll call this raw_data.
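Using the dplyr package:
library(dplyr)
# keep only the needed columns, keep only wins, then sort by errors (highest first)
softball <- raw_data %>%
  select(TEAM, ERRORS, WIN, DAY, MONTH, YEAR) %>%
  filter(WIN == TRUE) %>%
  arrange(desc(ERRORS))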
If you've coded WIN as 1/0, YES/NO, or some other scheme, adjust the filter condition accordingly.
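As a rough sketch of what that adjustment might look like, here are two alternative codings of WIN. The toy data frames and their values are made up for illustration and are not from your dataset.

```r
library(dplyr)

# Hypothetical toy data: WIN coded as 1/0
numeric_coded <- data.frame(
  TEAM   = c("Owls", "Hawks", "Owls"),
  WIN    = c(1, 0, 1),
  ERRORS = c(2, 5, 4)
)

# Hypothetical toy data: WIN coded as "YES"/"NO"
text_coded <- data.frame(
  TEAM   = c("Owls", "Hawks", "Owls"),
  WIN    = c("YES", "NO", "YES"),
  ERRORS = c(2, 5, 4)
)

# Compare against whatever value marks a win in your data
wins_numeric <- numeric_coded %>% filter(WIN == 1)
wins_text    <- text_coded %>% filter(WIN == "YES")
```

Running table(raw_data$WIN) first is a quick way to see which values actually occur before you write the filter.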
This is a classic divide and conquer problem. You don't care about runs or any of the other possible baseball statistics in your dataset, so put them to one side. You don't care about games that a team lost or tied, so extract only the wins. Now you want to find the highest number of errors in what's left and the corresponding date. This is essentially what analysis is all about: taking complicated problems and dividing them into bite-size pieces.
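For the last piece, pulling out the win with the most errors and its date, one possible sketch is below. It assumes a wins-only table that already has a DATE column. The rows are invented for illustration, and slice_max needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical wins-only data with DATE already built
wins_only <- data.frame(
  TEAM   = c("Owls", "Hawks", "Owls"),
  ERRORS = c(3L, 5L, 2L),
  DATE   = as.Date(c("2019-05-04", "2019-06-18", "2019-07-02"))
)

# Row(s) with the largest ERRORS; ties are kept by default (with_ties = TRUE)
most_errors_in_a_win <- wins_only %>% slice_max(ERRORS, n = 1)
```

If you want the answer per team rather than overall, adding group_by(TEAM) before slice_max should do it.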
The hardest part of this will actually be constructing the DATE out of DAY, MONTH and YEAR. Hint: Use dplyr::mutate.
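If you get stuck, here is one way it might go, assuming DAY, MONTH and YEAR are stored as whole numbers. It uses lubridate::make_date inside mutate, and the sample rows are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for raw_data; replace with your own data
raw_data <- data.frame(
  TEAM   = c("Owls", "Owls", "Hawks"),
  WIN    = c(TRUE, FALSE, TRUE),
  ERRORS = c(3L, 1L, 5L),
  DAY    = c(4L, 11L, 18L),
  MONTH  = c(5L, 5L, 6L),
  YEAR   = c(2019L, 2019L, 2019L)
)

# Build a Date column from the three parts, then drop the parts
with_date <- raw_data %>%
  mutate(DATE = make_date(year = YEAR, month = MONTH, day = DAY)) %>%
  select(TEAM, WIN, ERRORS, DATE)
```

If MONTH is stored as text such as "May", make_date won't accept it directly; pasting the parts together and parsing with lubridate::dmy is one alternative to try.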
Good luck!