Published by De Gruyter, May 18, 2023

Pitching strategy evaluation via stratified analysis using propensity score

  • Hiroshi Nakahara, Kazuya Takeda and Keisuke Fujii

Abstract

Recent measurement technologies make it possible to analyze baseball in greater detail. Many aspects of pitching strategy nevertheless remain unclear, and two factors make its effect difficult to measure. First, most public datasets do not record the location at which the catcher calls for the pitch, which is essential for inferring the battery’s intent. Second, many confounders are associated with pitching and batting results. Here we clarify the effect of attempting to pitch to a specific location, e.g., inside or outside. We employ a causal inference framework, stratified analysis using a propensity score, to evaluate this effect while removing the influence of confounding factors. We use a pitch-by-pitch dataset of Japanese professional baseball games played in 2014–2019, which includes the location at which the catcher calls for the pitch. The results reveal that an outside pitching attempt is more effective than an inside one at minimizing the allowed run average. In addition, the stratified analysis shows that the outside pitching attempt is effective regardless of the estimated ability of the batter and of the proportion of inside pitches for the pitcher or batter. Our analysis provides practical insights into selecting a pitching strategy to minimize allowed runs.
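For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of stratified analysis using a propensity score. It is not the authors' implementation and does not use their data: the single confounder, the simulated treatment and outcome, the logistic propensity model, the number of strata, and all function names are assumptions made for illustration only. The sketch fits a propensity model, splits observations into strata of similar propensity, and averages the within-stratum difference in mean outcome.

```python
import numpy as np

def fit_logistic(X, t, n_iter=50, ridge=1e-6):
    """Fit a logistic regression by Newton's method (intercept is the first weight)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (t - p) - ridge * w
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(len(w))
        w = w + np.linalg.solve(hess, grad)
    return w

def propensity_scores(X, w):
    """Estimated probability of receiving the treatment given covariates X."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def stratified_effect(y, t, ps, n_strata=5):
    """Size-weighted average of within-stratum treated-minus-control mean outcomes."""
    edges = np.quantile(ps, np.linspace(0.0, 1.0, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1, 0, n_strata - 1)
    total, n_used = 0.0, 0
    for s in range(n_strata):
        in_stratum = strata == s
        treated = in_stratum & (t == 1)
        control = in_stratum & (t == 0)
        # Skip strata that lack either group; no comparison is possible there.
        if treated.any() and control.any():
            total += in_stratum.sum() * (y[treated].mean() - y[control].mean())
            n_used += in_stratum.sum()
    return total / n_used

# Synthetic data (assumption): x is a hypothetical confounder that drives both
# the chance of treatment and the outcome; the true treatment effect is -0.2.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-1.5 * x)))
y = 0.5 * x - 0.2 * t + rng.normal(scale=0.5, size=n)

naive = y[t == 1].mean() - y[t == 0].mean()
w = fit_logistic(x[:, None], t)
ps = propensity_scores(x[:, None], w)
adjusted = stratified_effect(y, t, ps, n_strata=10)

print("naive difference:   ", naive)
print("stratified estimate:", adjusted)
```

In this simulated setting the naive difference is distorted by the confounder, while the stratified estimate should lie much closer to the simulated effect. Some residual bias remains because observations within a stratum still differ slightly in propensity.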


Corresponding author: Keisuke Fujii, Graduate School of Informatics, Nagoya University, Nagoya, Japan; RIKEN Center for Advanced Intelligence Project, Tokyo, Japan; and PRESTO, Japan Science and Technology Agency, Kawaguchi, Japan, E-mail:

Acknowledgments

The data was provided by Delta Inc.

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: This work was supported by JSPS KAKENHI (Grant Numbers 19H04941 and 20H04075) and JST PRESTO (JPMJPR20CA).

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Received: 2021-07-05
Accepted: 2023-03-30
Published Online: 2023-05-18
Published in Print: 2023-06-27

© 2023 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 4.5.2024 from https://www.degruyter.com/document/doi/10.1515/jqas-2021-0060/html