Abstract
This study uses time series analysis and machine learning techniques to model and forecast seasonal rainfall patterns in India. Two statistical models, the autoregressive integrated moving average (ARIMA) model and the state space model, and three machine learning models, Support Vector Machine, Artificial Neural Network and Random Forest, were developed, and their performance was compared against XGBoost, an advanced machine learning algorithm, on training and testing datasets. The results show that XGBoost is more accurate than the statistical models in capturing complex nonlinear rainfall patterns. While ARIMA models tend to overfit the training data, state space models prove more robust to outliers in the testing set. Diagnostic checks show that the models adequately capture the time series properties. The analysis indicates essentially unchanged rainfall patterns in India for 2023–2027, with implications for water resource management and for climate-sensitive sectors such as agriculture and power generation. Overall, the study highlights the efficacy of modern machine learning approaches such as XGBoost for forecasting complex meteorological time series. The framework presented enables rigorous validation and selection of the best-performing technique. Further applications of such data analysis can enhance planning and research on the Indian monsoon amid climate change challenges.
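The validation idea summarized above, fitting candidate models on a training period and selecting the one with the lowest error on a held-out testing period, can be illustrated with a minimal sketch. This is not the authors' implementation: the rainfall values are synthetic, the two candidates (a mean forecast and a seasonal-naive forecast) are simple stand-ins for ARIMA, state space and XGBoost models, RMSE is an assumed error metric, and all function names are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def chronological_split(series, test_size):
    """Hold out the last `test_size` points; time series must not be shuffled."""
    return series[:-test_size], series[-test_size:]

def mean_forecast(train, horizon):
    """Stand-in model 1: forecast the training mean for every future step."""
    mean = sum(train) / len(train)
    return [mean] * horizon

def seasonal_naive_forecast(train, horizon, period=4):
    """Stand-in model 2: repeat the last observed seasonal cycle."""
    last_cycle = train[-period:]
    return [last_cycle[h % period] for h in range(horizon)]

# Synthetic seasonal rainfall (mm) for four seasons per year over ten years.
# The base values are hypothetical and chosen only for illustration.
base = [30.0, 120.0, 850.0, 110.0]
series = [base[i % 4] + 5.0 * math.sin(i) for i in range(40)]

# Last two years (8 seasons) form the testing set.
train, test = chronological_split(series, test_size=8)

candidates = {
    "mean": mean_forecast,
    "seasonal_naive": seasonal_naive_forecast,
}

# Score every candidate on the held-out period and keep the most accurate one.
scores = {name: rmse(test, model(train, len(test))) for name, model in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: test RMSE = {score:.2f}")
print("selected model:", best)
```

In a fuller comparison, each entry in `candidates` would wrap a fitted ARIMA, state space or gradient-boosting model, and comparing training error with testing error would expose the kind of overfitting the abstract reports for ARIMA.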
Availability of data and materials
The data used to support the findings of this study are available from the corresponding author upon request.
Funding
No funding was received.
Author information
Contributions
All authors contributed equally to the research and to preparing the manuscript. The authors have read and agreed to the submitted version of the manuscript.
Ethics declarations
Conflict of interest
All authors certify that they have no conflict of interest.
About this article
Cite this article
Mishra, P., Al Khatib, A.M.G., Yadav, S. et al. Modeling and forecasting rainfall patterns in India: a time series analysis with XGBoost algorithm. Environ Earth Sci 83, 163 (2024). https://doi.org/10.1007/s12665-024-11481-w
DOI: https://doi.org/10.1007/s12665-024-11481-w