Abstract
Concept drift detection is a crucial challenge in machine learning, especially in dynamic environments where the underlying data distribution can change over time. In this study, we propose a sliding adaptive beta distribution model (SABeDM) for detecting concept drift. SABeDM combines an adaptive sliding window with the beta distribution to track changes in the underlying distribution of the data stream. The proposed model is evaluated on several synthetic and real-world datasets and compared with state-of-the-art drift detection methods. In terms of true positives, false positives, false negatives, and detection delay, our experimental results show that SABeDM outperforms existing methods (SRP, ADWIN, DDM, and EDDM). Accuracy, precision, recall, and F1-score were also used as evaluation criteria. In applications such as online learning, data stream mining, and real-time monitoring systems, SABeDM offers an effective and fast way to identify concept drift. Because it can help improve the reliability and accuracy of decision-making systems in dynamic settings, the proposed approach is a promising tool for machine learning practitioners.
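The abstract does not spell out the SABeDM algorithm, so the following is only a minimal sketch of the general idea it names: pairing a sliding window with a beta distribution over a classifier's error rate. Everything here is an assumption made for illustration, including the class name `BetaWindowDetector`, the parameters `recent_size`, `min_reference`, and `z_threshold`, the uniform Beta(1, 1) prior, and the z-score comparison of the two posteriors. It should not be read as the authors' method.

```python
from collections import deque


def beta_mean_var(errors, total):
    """Mean and variance of the Beta(1 + errors, 1 + correct) posterior."""
    a = 1.0 + errors
    b = 1.0 + (total - errors)
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


class BetaWindowDetector:
    """Hypothetical detector (not SABeDM): compares the Beta posterior of a
    recent sliding window against that of the older reference data."""

    def __init__(self, recent_size=50, min_reference=50, z_threshold=3.0):
        self.recent = deque(maxlen=recent_size)  # most recent 0/1 errors
        self.ref_errors = 0                      # error count in the reference
        self.ref_total = 0                       # sample count in the reference
        self.min_reference = min_reference
        self.z_threshold = z_threshold

    def update(self, error):
        """Feed one prediction error (1 = wrong, 0 = right).

        Returns True when drift is flagged on this sample.
        """
        if len(self.recent) == self.recent.maxlen:
            # The oldest recent sample graduates into the reference summary
            # just before the deque drops it.
            self.ref_errors += self.recent[0]
            self.ref_total += 1
        self.recent.append(error)

        if self.ref_total < self.min_reference:
            return False  # not enough history to compare against

        ref_mean, ref_var = beta_mean_var(self.ref_errors, self.ref_total)
        rec_mean, rec_var = beta_mean_var(sum(self.recent), len(self.recent))
        # One-sided score: only an increase in the error rate counts as drift.
        z = (rec_mean - ref_mean) / (ref_var + rec_var) ** 0.5
        if z > self.z_threshold:
            # Adapt: discard the outdated history and start over.
            self.ref_errors = 0
            self.ref_total = 0
            self.recent.clear()
            return True
        return False


# Deterministic toy stream: 10% errors for 1000 samples, then 50% errors.
stream = [int(i % 10 == 0) for i in range(1000)]
stream += [int(i % 2 == 0) for i in range(1000)]

detector = BetaWindowDetector()
detections = [i for i, err in enumerate(stream) if detector.update(err)]
print("drift flagged at indices:", detections)
```

On this toy stream the detector stays silent during the stable phase and raises a single alarm within one recent-window length of the change at index 1000. The gap between the change point and the alarm corresponds to the detection delay that the abstract lists among its evaluation measures.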
Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author upon reasonable request.
Author information
Contributions
Angbera Ature wrote the main manuscript text and conducted all the experiments and analysis, while H.Y. Chan supervised the work. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Angbera, A., Chan, H.Y. SABeDM: a sliding adaptive beta distribution model for concept drift detection in a dynamic environment. Knowl Inf Syst 66, 2039–2062 (2024). https://doi.org/10.1007/s10115-023-02004-3
DOI: https://doi.org/10.1007/s10115-023-02004-3