
Assignment 2: Pitching Statistics

For Assignment 2, we worked on understanding basic pitching metrics and learned how to filter specific data, write functions, and create scatter plots. For the pitching metrics, we showed how to calculate innings pitched (IP), earned run average (ERA), walks plus hits per inning pitched (WHIP), and strikeout-to-walk ratio (K/BB).
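As a rough illustration of these four metrics, here is a minimal sketch in R using their standard definitions. The function name `pitching_metrics`, its argument names, and the season line passed to it are made up for the example and are not taken from the assignment files.

```r
# Hypothetical helper: the name and arguments are illustrative only.
pitching_metrics <- function(ip_outs, earned_runs, walks, hits, strikeouts) {
  ip <- ip_outs / 3                     # innings pitched: three outs per inning
  data.frame(
    IP   = ip,
    ERA  = 9 * earned_runs / ip,        # earned runs allowed per nine innings
    WHIP = (walks + hits) / ip,         # walks plus hits per inning pitched
    K_BB = strikeouts / walks           # strikeout-to-walk ratio
  )
}

# Made-up season line: 600 outs, 80 earned runs, 60 walks, 180 hits, 210 strikeouts
m <- pitching_metrics(ip_outs = 600, earned_runs = 80, walks = 60,
                      hits = 180, strikeouts = 210)
print(m)
```

Because the arithmetic is vectorized, the same function should also work on whole columns of a data frame, one row per pitcher or team.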

In R, we plotted win percentage versus home runs allowed, earned run average, and walks plus hits per inning pitched as three separate scatter plots.

Lastly, we wrote algorithms to determine whether a pitched ball was in the strike zone and to calculate the percentage of strikes that a pitcher threw.
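The strike zone code itself is not shown on this page, so the following is only a sketch of one way such a check could look. It assumes pitch locations are given in feet, with `px` measured from the center of home plate and `pz` from the ground, and that the top and bottom of the zone are supplied for the batter. The function names, the constant, and the sample pitches are all hypothetical. It also counts only pitches located inside the zone, which is a simplification, since a swing at a pitch outside the zone is also a strike.

```r
# Assumption: home plate is 17 inches wide, so half its width in feet is:
PLATE_HALF_WIDTH_FT <- (17 / 2) / 12

# Hypothetical check: TRUE when the pitch crosses the plate inside the zone.
in_strike_zone <- function(px, pz, sz_bot, sz_top) {
  abs(px) <= PLATE_HALF_WIDTH_FT & pz >= sz_bot & pz <= sz_top
}

# Percentage of pitches flagged as strikes (logical vector in, number out).
strike_pct <- function(is_strike) {
  100 * mean(is_strike)
}

# Made-up pitch locations for illustration
px <- c(0.0, 0.5, 1.2, -0.3)
pz <- c(2.5, 3.0, 2.5, 1.0)
zone <- in_strike_zone(px, pz, sz_bot = 1.5, sz_top = 3.5)
pct <- strike_pct(zone)
print(zone)
print(pct)
```

In this example the third pitch is too far outside and the fourth is too low, so two of the four pitches land in the zone.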


library(readr)

library(tidyverse)
library(ggpubr)

TPD2000_2019 <- read_csv("MATH 494/Teams Pitching Data 2000 - 2019.csv")

#Part A1: Calculating Win Percent
TPD2000_2019$WinPerc <- round(TPD2000_2019$W / TPD2000_2019$G, 3)

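#Part A2: Calculating WHIP
# IPouts counts outs recorded, so dividing by 3 gives innings pitched
IP <- TPD2000_2019$IPouts / 3
# Walks allowed plus hits allowed (hit batters are not included)
walks_plus_hits <- TPD2000_2019$BBA + TPD2000_2019$HA
TPD2000_2019$WHIP <- round(walks_plus_hits / IP, 3)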

#Part B1: Creating Scatter plots
wp_v_HRA <- ggplot(TPD2000_2019, aes(HRA, WinPerc)) +
  geom_point(color = "purple") +
  geom_smooth(method = "lm", se = FALSE, color = "#FF99FF") +
  labs(title = "Win Percentage vs Home Runs Allowed",
       x = "Home Runs Allowed", y = "Winning Percentage") +
  # after_stat() replaces the deprecated ..r.label.. notation (ggplot2 >= 3.3.0)
  stat_cor(aes(label = after_stat(r.label)), label.x = 90, label.y = 0.8) +
  stat_cor(aes(label = after_stat(rr.label)), label.x = 90, label.y = 0.745)

wp_v_ERA <- ggplot(TPD2000_2019, aes(ERA, WinPerc)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", se = FALSE, color = "#FF99FF") +
  labs(title = "Win Percentage vs Team ERA",
       x = "Team ERA", y = "Winning Percentage") +
  stat_cor(aes(label = after_stat(r.label)), label.x = 5, label.y = 0.8) +
  stat_cor(aes(label = after_stat(rr.label)), label.x = 5, label.y = 0.745)

wp_v_WHIP <- ggplot(TPD2000_2019, aes(WHIP, WinPerc)) +
  geom_point(color = "green") +
  geom_smooth(method = "lm", se = FALSE, color = "#FF99FF") +
  labs(title = "Win Percentage vs WHIP",
       x = "WHIP", y = "Winning Percentage") +
  stat_cor(aes(label = after_stat(r.label)), label.x = 1.15, label.y = 0.8) +
  stat_cor(aes(label = after_stat(rr.label)), label.x = 1.15, label.y = 0.745)

 

ggarrange(wp_v_HRA, wp_v_ERA, wp_v_WHIP, ncol=2, nrow=2)
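The R and R-squared labels that `stat_cor()` draws on each plot can be cross-checked with base R. The sketch below uses a small made-up data frame named `teams` in place of the real `TPD2000_2019` table, so the numbers it prints are illustrative only. It assumes `stat_cor()` is left at its default Pearson method, as in the plots above.

```r
# Made-up stand-in for two columns of the team pitching table
teams <- data.frame(
  ERA     = c(3.5, 4.0, 4.5, 5.0),
  WinPerc = c(0.60, 0.52, 0.50, 0.42)
)

# Pearson correlation coefficient (cor() uses method = "pearson" by default)
r <- cor(teams$ERA, teams$WinPerc)

# R-squared from the same straight-line fit that geom_smooth(method = "lm") draws
fit <- lm(WinPerc ~ ERA, data = teams)
r_squared <- summary(fit)$r.squared

print(r)
print(r_squared)
```

With a single predictor, R-squared equals the square of the correlation coefficient, so the two labels on each plot carry the same information apart from the sign.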

Assignment 2 Code and Results


Anastashia Pelletier

©2022 by Anastashia Pelletier. Proudly created with Wix.com
