Abstract
Recent measurement technologies allow baseball to be analyzed in greater detail, yet much about pitching strategy remains unclear. Two obstacles make the effect of a pitching strategy difficult to measure. First, most public datasets do not record the location where the catcher calls for the pitch, which is essential for inferring the intent of the battery (the pitcher and catcher). Second, many confounders are associated with pitching and batting results. Here we estimate the effect of attempting to pitch to a specific location, such as inside or outside. We use a causal inference framework, stratified analysis on the propensity score, to estimate these effects while removing the influence of confounding factors. Our pitch-by-pitch dataset covers Japanese professional baseball games played in 2014–2019 and includes the location where the catcher called for each pitch. The results show that an outside pitching attempt is more effective than an inside one at minimizing the allowed run average. The stratified analysis further shows that the outside attempt remains effective regardless of the batter’s estimated ability and of the proportion of inside pitches for the pitcher or the batter. Our analysis offers practical insight into choosing a pitching strategy that minimizes runs allowed.
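To make the idea of stratified analysis on a propensity score concrete, the following is a minimal sketch on synthetic data, not the model or covariates used in this study. The single confounder `batter_ability`, the treatment indicator `outside` (1 for an outside attempt, 0 for an inside attempt), the outcome `runs`, the effect size `TRUE_EFFECT`, and all function names are hypothetical and chosen only for illustration. The propensity score is fitted here with a plain logistic regression, which is an assumption.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25, ridge=1e-6):
    """Fit a logistic regression by Newton's method (intercept included)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (t - p)
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_propensity(X, beta):
    """Estimated probability of receiving the treatment given covariates."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ beta))

def stratified_effect(y, t, ps, n_strata=10):
    """Size-weighted average of within-stratum treated-minus-control means."""
    edges = np.quantile(ps, np.linspace(0.0, 1.0, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1, 0, n_strata - 1)
    total, used = 0.0, 0
    for s in range(n_strata):
        in_s = strata == s
        treated, control = in_s & (t == 1), in_s & (t == 0)
        if treated.any() and control.any():  # skip strata lacking one group
            total += in_s.sum() * (y[treated].mean() - y[control].mean())
            used += in_s.sum()
    return total / used

# Synthetic data (hypothetical): stronger batters see more outside attempts
# and also score more runs, so the naive comparison is confounded.
rng = np.random.default_rng(0)
n = 20000
TRUE_EFFECT = -0.2  # assumed effect of an outside attempt on runs
batter_ability = rng.normal(size=n)
outside = (rng.random(n) < 1.0 / (1.0 + np.exp(-batter_ability))).astype(int)
runs = 0.5 * batter_ability + TRUE_EFFECT * outside + rng.normal(size=n)

X = batter_ability.reshape(-1, 1)
ps = predict_propensity(X, fit_logistic(X, outside))

naive = runs[outside == 1].mean() - runs[outside == 0].mean()
stratified = stratified_effect(runs, outside, ps)
print(f"naive difference:    {naive:+.3f}")
print(f"stratified estimate: {stratified:+.3f}")
```

In this toy setting the naive difference mixes the effect of the attempt with the effect of batter ability, whereas comparing treated and control pitches within strata of similar propensity score removes most of that confounding. Some residual bias remains within each stratum, and it shrinks as the number of strata grows.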
Funding source: Japan Society for the Promotion of Science
Award Identifier / Grant number: 19H04941 and 20H04075
Funding source: Precursory Research for Embryonic Science and Technology
Award Identifier / Grant number: JPMJPR20CA
Acknowledgments
The data was provided by Delta Inc.
- Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
- Research funding: This work was supported by JSPS KAKENHI (Grant Numbers 19H04941 and 20H04075) and JST PRESTO (JPMJPR20CA).
- Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
© 2023 Walter de Gruyter GmbH, Berlin/Boston