Modeling and forecasting rainfall patterns in India: a time series analysis with XGBoost algorithm

  • Original Article
  • Environmental Earth Sciences

Abstract

This study uses time series analysis and machine learning techniques to model and forecast rainfall patterns across different seasons in India. Two statistical models (the autoregressive integrated moving average, ARIMA, and a state space model) and three machine learning models (Support Vector Machine, Artificial Neural Network and Random Forest) were developed, and their performance was compared against XGBoost, an advanced machine learning algorithm, using training and testing datasets. The results demonstrate the superior accuracy of XGBoost compared to the statistical models in capturing complex nonlinear rainfall patterns. While ARIMA models tend to overfit the training data, state space models prove more robust to outliers in the testing set. Diagnostic checks show that the models adequately capture the time series properties. The analysis indicates essentially unchanged rainfall patterns in India for 2023–2027, with implications for water resource management and climate-sensitive sectors such as agriculture and power generation. Overall, the study highlights the efficacy of modern machine learning approaches such as XGBoost for forecasting complex meteorological time series. The framework presented enables rigorous validation and selection of optimal techniques. Further applications of such data analysis can significantly enhance planning and research on the Indian monsoons amid climate change challenges.
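The comparison on training and testing datasets described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two forecasters are deliberately simple stand-ins (a training-mean forecast and a seasonal naive forecast) for the ARIMA, state space, and XGBoost models of the study. The function names, the RMSE criterion, and the synthetic series are all assumptions made for illustration.

```python
import math


def chronological_split(series, test_size):
    """Split a series into train/test parts without shuffling (time order is kept)."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_forecast(train, horizon):
    """Stand-in model 1 (hypothetical): repeat the training mean."""
    mean = sum(train) / len(train)
    return [mean] * horizon


def seasonal_naive_forecast(train, horizon, period):
    """Stand-in model 2 (hypothetical): repeat the last observed seasonal cycle."""
    last_cycle = train[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def compare_models(series, test_size, period):
    """Fit each stand-in on the training part and score it on the held-out part."""
    train, test = chronological_split(series, test_size)
    forecasts = {
        "mean": mean_forecast(train, len(test)),
        "seasonal_naive": seasonal_naive_forecast(train, len(test), period),
    }
    scores = {name: rmse(test, pred) for name, pred in forecasts.items()}
    best = min(scores, key=scores.get)  # lowest test RMSE wins
    return scores, best


# Synthetic, illustrative values only (NOT data from the study):
# four "seasons" per year, repeated for six years.
synthetic = [10.0, 80.0, 30.0, 5.0] * 6
scores, best = compare_models(synthetic, test_size=4, period=4)
print(scores, best)
```

The split is chronological rather than random because shuffling a time series would leak future observations into the training set. In a fuller version, each stand-in function would be replaced by a fitted model that exposes the same "train in, forecast out" interface.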

Availability of data and materials

The data used to support the findings of this study are available from the corresponding author upon request.

Funding

No funding.

Author information

Contributions

All authors equally contributed to the research and to preparing the manuscript. The authors have read and agreed to the submitted version of the manuscript.

Corresponding author

Correspondence to Soumik Ray.

Ethics declarations

Conflict of interest

All authors certify that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mishra, P., Al Khatib, A.M.G., Yadav, S. et al. Modeling and forecasting rainfall patterns in India: a time series analysis with XGBoost algorithm. Environ Earth Sci 83, 163 (2024). https://doi.org/10.1007/s12665-024-11481-w

  • DOI: https://doi.org/10.1007/s12665-024-11481-w
