Evaluating the impact of drift detection mechanisms on stock market forecasting

  • Regular Paper
Knowledge and Information Systems

Abstract

The stock market is an important segment of the economy in which a large volume of assets circulates. Many factors affect stock market transactions and cause stock values to fluctuate, which poses a problem for those who seek to forecast future values and maximize their profits. The problem is more serious when stock values exhibit concept drift, that is, when the patterns underlying the values change over time. In this work, we evaluate whether machine learning-based predictors that incorporate mechanisms to deal with concept drift are suitable for stock market forecasting. To do so, we used a historical database of stock prices of 10 companies traded on the Brazilian stock exchange, collected over 20 years. We compared the performance of predictors based on different paradigms, with and without mechanisms to deal with concept drift. The results showed that, although the strategies that handle concept drift demand longer computational times, they also tend to present smaller prediction errors. The highlight was the EOS-D approach, which had the best performance for 6 of the 10 stocks analyzed in one-to-one comparisons.
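To make the idea of a drift detection mechanism concrete, the following is a minimal sketch of a Page-Hinkley test (PHT, one of the detectors listed under Abbreviations) applied to a stream of absolute prediction errors. It is not the authors' implementation: the class name, the parameter values (`delta`, `threshold`, `min_samples`) and the synthetic error stream are all assumptions chosen for illustration.

```python
class PageHinkley:
    """Minimal Page-Hinkley test for an upward shift in the mean of a stream.

    Parameter defaults are illustrative assumptions, not values from the paper.
    """

    def __init__(self, delta=0.005, threshold=5.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level for the test statistic
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.reset()

    def reset(self):
        self.n = 0          # samples seen since the last reset
        self.mean = 0.0     # running mean of the stream
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # minimum of the cumulative deviation so far

    def update(self, x):
        """Feed one value; return True if a drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        drift = (self.n >= self.min_samples
                 and self.cum - self.cum_min > self.threshold)
        if drift:
            self.reset()  # start monitoring the new concept from scratch
        return drift


def detect_drifts(errors, detector):
    """Return the indices at which the detector signals a drift."""
    return [i for i, e in enumerate(errors) if detector.update(e)]


# Synthetic absolute errors: the predictor is accurate for 200 steps, then
# the underlying pattern changes and its error jumps (hypothetical data).
errors = [0.1] * 200 + [1.0] * 200
detections = detect_drifts(errors, PageHinkley())
print(detections)
```

In a forecasting pipeline, a detection would typically trigger retraining or replacement of the predictor on recent data, which is one reason drift-aware strategies tend to cost more computation than a model trained once.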

Data availability

The data sets generated and/or analyzed during the current study, together with the source code (in Python), are available in the Repositório de Dados de Pesquisa da Unicamp (REDU), https://doi.org/10.25824/redu/WB2R60.

Notes

  1. Available at: https://doi.org/10.25824/redu/WB2R60.

Abbreviations

\([B]^{3}\): Brasil, Bolsa, Balcão (main Brazilian stock exchange)

ABEV3: Ticker for Ambev

ADWIN: Adaptive windowing

ANN: Artificial neural network

BBAS3: Ticker for Banco do Brasil S.A.

BBDC3: Ticker for Banco Bradesco S.A.

DDM: Drift detection mechanism

DOER: Dynamic and online ensemble for regression

DOER-RANK: A variation of the DOER algorithm that incorporates component ranking

DS: Directional symmetry

ECDD: Exponentially weighted moving average for concept drift

ELM: Extreme learning machine

EOS: Ensemble of online learners with substitution of models

EOS-D: A variation of the EOS algorithm that incorporates dynamic adaptations

EOS-RANK: A variation of the EOS algorithm that incorporates component ranking

GGBR3: Ticker for Gerdau

HDDM_A: Drift detection method based on Hoeffding’s bounds using moving average

HDDM_W: Drift detection method based on Hoeffding’s bounds using weighted moving average

ITUB3: Ticker for Itaú Unibanco

KNN: K-nearest neighbors

LAME3: Ticker for Lojas Americanas S.A.

MAE: Mean absolute error

MAPE: Mean absolute percentage error

MSE: Mean squared error

MOA: Massive online analysis

OS-ELM: Online sequential extreme learning machine

PETR3: Ticker for Petróleo Brasileiro S.A. - Petrobras

PHT: Page-Hinkley test

\(PM_{2.5}\): Particulate matter 2.5

RADL3: Ticker for Raia Drogasil

RDDM: Reactive drift detection method

RF: Random forests

RF-A: A variation of the RF algorithm that retrains the model with each new sample

RMS: Root mean square

RMSE: Root mean square error

SMAPE: Symmetric mean absolute percentage error

STEPD: Statistical test of equal proportions

SVM: Support vector machine

SVR: Support vector regression

VALE3: Ticker for Vale S.A.

WDS: Weighted directional symmetry

WEGE3: Ticker for Weg S.A.
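Several of the abbreviations above name forecast error measures. As a rough guide, the sketch below implements common textbook definitions of MAE, MSE, RMSE, MAPE, SMAPE and DS in plain Python. The exact formulas used in the paper are not given in this excerpt and may differ; SMAPE and DS in particular have several variants in the literature, so the forms chosen here are assumptions, as are the function names and the sample values.

```python
import math


def mae(y, p):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)


def mse(y, p):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y, p)) / len(y)


def rmse(y, p):
    """Root mean square error."""
    return math.sqrt(mse(y, p))


def mape(y, p):
    """Mean absolute percentage error (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, p)) / len(y)


def smape(y, p):
    """Symmetric MAPE, variant with the mean of |actual| and |predicted|
    in the denominator (an assumption; other variants exist)."""
    return 100.0 * sum(
        abs(a - b) / ((abs(a) + abs(b)) / 2) for a, b in zip(y, p)
    ) / len(y)


def directional_symmetry(y, p):
    """Percentage of steps where actual and predicted series move in the
    same direction (one common variant; ties count as agreement)."""
    hits = sum(
        1
        for i in range(1, len(y))
        if (y[i] - y[i - 1]) * (p[i] - p[i - 1]) >= 0
    )
    return 100.0 * hits / (len(y) - 1)


# Hypothetical closing prices and forecasts, for illustration only.
actual = [10.0, 12.0, 11.0, 13.0]
predicted = [11.0, 12.0, 10.0, 14.0]

for name, fn in [("MAE", mae), ("MSE", mse), ("RMSE", rmse),
                 ("MAPE", mape), ("SMAPE", smape),
                 ("DS", directional_symmetry)]:
    print(f"{name}: {fn(actual, predicted):.4f}")
```

The error-magnitude measures (MAE, MSE, RMSE, MAPE, SMAPE) are better when smaller, whereas DS measures how often the predicted direction of movement is right and is better when larger.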

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Author information

Contributions

LFPCF was involved in conceptualization, methodology, software, validation, formal analysis, investigation and writing—original draft. GPC was responsible for conceptualization, writing—reviewing and editing, supervision and project administration.

Corresponding authors

Correspondence to Luis Fernando Panicachi Cocovilo Filho or Guilherme Palermo Coelho.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cocovilo Filho, L.F.P., Coelho, G.P. Evaluating the impact of drift detection mechanisms on stock market forecasting. Knowl Inf Syst 66, 723–763 (2024). https://doi.org/10.1007/s10115-023-02025-y

  • DOI: https://doi.org/10.1007/s10115-023-02025-y
