Applying Machine Learning Techniques to Baseball Pitch Prediction

Michael Hamilton, Phuong Hoang, Lori Layne, Joseph Murray, David Padget, Corey Stafford, Hien Tran

2014

Abstract

Major League Baseball, a professional baseball league in the US and Canada, is one of the most popular sports leagues in North America. Partly because of its popularity and the wide availability of game data, baseball has become the subject of significant statistical and mathematical analysis. Pitch analysis is especially useful for helping a team understand the pitch behavior it may face during a game, allowing the team to develop a batting strategy to counter the predicted pitches. We apply several common machine learning classification methods to PITCHf/x data to classify pitches by type. We then extend the classification task to prediction by using only features known before a pitch is thrown. By performing extensive feature analysis and introducing a novel approach to feature selection, we achieve a moderate improvement over previous results.

References

  1. Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8):861-874.
  2. Ganeshapillai, G. and Guttag, J. (2012). Predicting the next pitch. In MIT Sloan Sports Analytics Conference.
  3. Hopkins, T. and Magel, R. (2008). Slugging percentage in differing baseball counts. Journal of Quantitative Analysis in Sports, 4(2):1136.
  4. MATLAB (2013). MATLAB documentation.
  5. MLB (2012). Major league baseball attendance records. Retrieved June 19, 2013 from http://espn.go.com/mlb/attendance/_/year/2012.
  6. Pitchf/x (2013). MLB PITCHf/x data. Retrieved July, 2013 from http://www.mlb.com.
  7. SVM (2013). Support vector machines explained. Retrieved July 3, 2013 from http://www.tristanfletcher.co.uk.
  8. Theodoridis, S. and Koutroumbas, K. (2009). Pattern Recognition. Academic Press, Burlington, Mass., 4th edition.
  9. Weinstein-Gould, J. (2009). Keeping the hitter off balance: Mixed strategies in baseball. Journal of Quantitative Analysis in Sports, 5(2):1173.
  10. Wikipedia (2013). Wikipedia glossary of baseball. Retrieved July, 2013 from http://en.wikipedia.org/wiki/Glossary_of_baseball.


Paper Citation


in Harvard Style

Hamilton M., Hoang P., Layne L., Murray J., Padget D., Stafford C. and Tran H. (2014). Applying Machine Learning Techniques to Baseball Pitch Prediction. In Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-018-5, pages 520-527. DOI: 10.5220/0004763905200527


in Bibtex Style

@conference{icpram14,
author={Michael Hamilton and Phuong Hoang and Lori Layne and Joseph Murray and David Padget and Corey Stafford and Hien Tran},
title={Applying Machine Learning Techniques to Baseball Pitch Prediction},
booktitle={Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2014},
pages={520-527},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004763905200527},
isbn={978-989-758-018-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Applying Machine Learning Techniques to Baseball Pitch Prediction
SN - 978-989-758-018-5
AU - Hamilton M.
AU - Hoang P.
AU - Layne L.
AU - Murray J.
AU - Padget D.
AU - Stafford C.
AU - Tran H.
PY - 2014
SP - 520
EP - 527
DO - 10.5220/0004763905200527