Abstract
The stock market is an important segment of the economy in which a large volume of assets circulates. Many factors affect stock market transactions and cause stock values to fluctuate, which is a problem for those who seek to forecast future values and maximize their profits. The problem is more serious when stock values exhibit concept drift, that is, when the patterns in the values change over time. In this work, we evaluated whether machine learning-based predictors that incorporate mechanisms to deal with concept drift are suitable for stock market forecasting. To do so, we used a historical database of stock prices of 10 companies traded on the Brazilian stock exchange, collected over 20 years. We compared the performance of predictors based on different paradigms, with and without mechanisms to deal with concept drift. The results showed that, although the strategies that handle concept drift demand longer computational times, they also tend to present smaller prediction errors. The highlight was the EOS-D approach, which had the best performance in 6 of the 10 stocks analyzed in one-to-one comparisons.
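To make the idea of a predictor "with a mechanism to deal with concept drift" concrete, the following is a minimal sketch, not the EOS-D method or the experimental setup of this study. It assumes a synthetic price series with one abrupt level shift, a deliberately trivial "model" (a constant level estimated as a mean), and a hand-written Page-Hinkley test monitoring the absolute forecast error. The names `PageHinkley`, `run`, `delta`, `threshold` and `warmup`, and all numeric values, are illustrative assumptions.

```python
import random


class PageHinkley:
    """Minimal Page-Hinkley test for an increase in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold, often called lambda (assumed value)
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one value; return True when a drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def run(series, warmup=50, use_detector=True):
    """One-step-ahead forecasting with a constant-level model.

    Returns the mean absolute error and the indices where drift was signalled.
    """
    buffer = list(series[:warmup])       # samples the model is fitted on
    model = sum(buffer) / len(buffer)    # the "model" is just a level estimate
    detector = PageHinkley()
    abs_errors, drifts = [], []
    for t in range(warmup, len(series)):
        abs_errors.append(abs(series[t] - model))  # forecast made before seeing series[t]
        if not use_detector:
            continue  # static model: never refitted
        if detector.update(abs_errors[-1]):
            drifts.append(t)
            buffer = [series[t]]  # discard data from the old concept
            detector.reset()
        elif len(buffer) < warmup:
            buffer.append(series[t])  # keep collecting data from the new concept
        model = sum(buffer) / len(buffer)  # refit on the current buffer
    return sum(abs_errors) / len(abs_errors), drifts


# Synthetic series (an assumption): the level jumps from 10 to 20 at index 200.
rng = random.Random(0)
series = [10 + rng.gauss(0, 0.1) for _ in range(200)]
series += [20 + rng.gauss(0, 0.1) for _ in range(200)]

mae_static, _ = run(series, use_detector=False)
mae_adaptive, drifts = run(series, use_detector=True)
print("MAE without drift handling:", mae_static)
print("MAE with drift handling:", mae_adaptive)
print("drift signalled at indices:", drifts)
```

In this toy setting the static model keeps forecasting the old level after the shift, while the monitored model discards its training data once the detector fires and refits on observations from the new concept. The extra bookkeeping on every sample is a small-scale analogue of the additional computational cost noted above.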
Data availability
The data sets generated and/or analyzed during the current study, together with the source code (in Python), are available in the Repositório de Dados de Pesquisa da Unicamp (REDU), https://doi.org/10.25824/redu/WB2R60.
Abbreviations
- \([B]^{3}\): Brasil, Bolsa, Balcão (main Brazilian stock exchange)
- ABEV3: Ticker for Ambev
- ADWIN: Adaptive windowing
- ANN: Artificial neural network
- BBAS3: Ticker for Banco do Brasil S.A.
- BBDC3: Ticker for Banco Bradesco S.A.
- DDM: Drift detection mechanism
- DOER: Dynamic and online ensemble for regression
- DOER-RANK: A variation of the DOER algorithm that incorporates component ranking
- DS: Directional symmetry
- ECDD: Exponentially weighted moving average for concept drift
- ELM: Extreme learning machine
- EOS: Ensemble of online learners with substitution of models
- EOS-D: A variation of the EOS algorithm that incorporates dynamic adaptations
- EOS-RANK: A variation of the EOS algorithm that incorporates component ranking
- GGBR3: Ticker for Gerdau
- HDDM_A: Drift detection method based on Hoeffding’s bounds using moving average
- HDDM_W: Drift detection method based on Hoeffding’s bounds using weighted moving average
- ITUB3: Ticker for Itaú Unibanco
- KNN: K-nearest neighbor
- LAME3: Ticker for Lojas Americanas S.A.
- MAE: Mean absolute error
- MAPE: Mean absolute percentage error
- MSE: Mean squared error
- MOA: Massive online analysis
- OS-ELM: Online sequential extreme learning machine
- PETR3: Ticker for Petróleo Brasileiro S.A. - Petrobras
- PHT: Page-Hinkley test
- \(PM_{2.5}\): Particulate matter 2.5
- RADL3: Ticker for Raia Drogasil
- RDDM: Reactive drift detection method
- RF: Random forests
- RF-A: A variation of the RF algorithm that retrains the model with each new sample
- RMS: Root mean square
- RMSE: Root mean square deviation
- SMAPE: Symmetric mean absolute percentage error
- STEPD: Statistical test of equal proportions
- SVM: Support vector machine
- SVR: Support vector regression
- VALE3: Ticker for Vale S.A.
- WDS: Weighted directional symmetry
- WEGE3: Ticker for Weg S.A.
Funding
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Author information
Contributions
LFPCF was involved in conceptualization, methodology, software, validation, formal analysis, investigation and writing of the original draft. GPC was responsible for conceptualization, review and editing of the manuscript, supervision and project administration.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Fernando Panicachi Cocovilo Filho, L., Palermo Coelho, G. Evaluating the impact of drift detection mechanisms on stock market forecasting. Knowl Inf Syst 66, 723–763 (2024). https://doi.org/10.1007/s10115-023-02025-y
DOI: https://doi.org/10.1007/s10115-023-02025-y