Abstract
The goal of change point detection (CPD) is to identify abrupt changes in the statistics of signals or time series that reflect transitions in the underlying system’s properties or states. While many statistical and learning-based approaches have been proposed to address this task, most state-of-the-art methods still treat the problem in an unsupervised setting. As a result, there is often a large gap between the change points an algorithm detects and those the user expects. To bridge this gap, we propose an active-learning strategy for the CPD problem that is combined with a one-class support vector machine (OCSVM) model, resulting in an interactive CPD algorithm that improves itself by querying the end-user. This approach enables us to focus on detecting the desired change points and to ignore false positives or irrelevant change points. We demonstrate that the interactive OCSVM model can be combined with various unsupervised CPD models to function in a semi-supervised setting, resulting in improved detection accuracy. Our experimental results on various simulated and real-life datasets demonstrate a significant improvement in detection performance on both single- and multi-channel time series, even with a limited number of queries.
Code Availability
The code described in the article is available in the following repository: https://github.com/caozhenxiang/ICPD.
Change history
02 January 2024
A Correction to this paper has been published: https://doi.org/10.1007/s10618-023-01000-z
References
Adams RP, MacKay DJC (2007) Bayesian online changepoint detection. Mach Learn
Aminikhanghahi S, Cook DJ (2017) A survey of methods for time series change point detection. Knowl Inf Syst 51(2):339–367. https://doi.org/10.1007/s10115-016-0987-z
An J, Cho S (2015) Variational autoencoder based anomaly detection using reconstruction probability. Spec Lect IE 2(1):1–18
Appel U, Brandt AV (1983) Adaptive sequential segmentation of piecewise stationary time series. Inf Sci 29(1):27–56
Basseville M, Nikiforov IV, et al (1993) Detection of abrupt changes: theory and application, vol 104. Prentice Hall, Englewood Cliffs
Bellinger C, Sharma S, Japkowicz N (2012) One-class versus binary classification: Which and when? In: 2012 11th international conference on machine learning and applications, vol 2, pp 102–106. https://doi.org/10.1109/ICMLA.2012.212
Bosc M, Heitz F, Armspach J-P, Namer I, Gounot D, Rumbach L (2003) Automatic change detection in multimodal serial MRI: application to multiple sclerosis lesion evolution. Neuroimage 20(2):643–656
Brandt AV (1983) Detecting and estimating parameter jumps using ladder algorithms and likelihood ratio tests. In: ICASSP’83. IEEE international conference on acoustics, speech, and signal processing, vol 8. IEEE, pp 1017–1020
Breunig MM, Kriegel H-P, Ng RT, Sander J (2000) LOF: identifying density-based local outliers. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data, pp 93–104
Chandola V, Vatsavai RR (2010) Scalable time series change detection for biomass monitoring using gaussian process. In: CIDU, pp 69–82
Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv (CSUR) 41(3):1–58
Chang W-C, Li C-L, Yang Y, Póczos B (2019) Kernel change-point detection with auxiliary deep generative models. arXiv:1901.06077
Cheng KC, Miller EL, Hughes MC, Aeron S (2020) On matched filtering for statistical change point detection. IEEE Open J Signal Process 1:159–176
Chib S (1998) Estimation and comparison of multiple change-point models. J Econom 86(2):221–241
Cho H, Fryzlewicz P (2015) Multiple-change-point detection for high dimensional time series via sparsified binary segmentation. J R Stat Soc Ser B Stat Methodol 475–507
Cleland I, Han M, Nugent C, Lee H, McClean S, Zhang S, Lee S (2014) Evaluation of prompted annotation of activity data recorded from a smart phone. Sensors 14(9):15861–15879
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
De Brabandere A, Cao Z, De Vos M, Bertrand A, Davis J (2022) Semi-supervised change point detection using active learning. In: International conference on discovery science. Springer, pp 74–88
De Brabandere A, Op De Beéck T, Hendrickx K, Meert W, Davis J (2022) TSFuse: Automated feature construction for multiple time series data. Machine Learning, pp 1–56. https://doi.org/10.1007/s10994-021-06096-2
De Ryck T, De Vos M, Bertrand A (2021) Change point detection in time series data using autoencoders with a time-invariant representation. IEEE Trans Signal Process
Deldari S, Smith DV, Xue H, Salim FD (2021) Time series change point detection with self-supervised contrastive predictive coding. In: Proceedings of the web conference 2021, pp 3124–3135
Desobry F, Davy M, Doncarli C (2005) An online kernel change detection algorithm. IEEE Trans Signal Process 53(8):2961–2974
Ducré-Robitaille J-F, Vincent LA, Boulet G (2003) Comparison of techniques for detection of discontinuities in temperature series. Int J Climatol A J R Meteorol Soc 23(9):1087–1101
Ebrahimzadeh Z, Zheng M, Karakas S, Kleinberg S (2019) Deep learning for multi-scale changepoint detection in multivariate time series
Gupta M, Wadhvani R, Rasool A (2022) Real-time change-point detection: a deep neural network-based adaptive approach for detecting changes in multivariate time series data. Expert Syst Appl 209:118260. https://doi.org/10.1016/j.eswa.2022.118260
Itoh N, Kurths J (2010) Change-point detection of climate time series by nonparametric method. In: Proceedings of the world congress on engineering and computer science, vol 1. Citeseer, pp 445–448
Lee W-H, Ortiz J, Ko B, Lee R (2018) Time series segmentation through automatic feature learning. arXiv:1801.05394
Li J, Lei P, Todorovic S (2019) Weakly supervised energy-based learning for action segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6243–6251
Liu FT, Ting KM, Zhou Z-H (2008) Isolation forest. In: 2008 Eighth IEEE international conference on data mining. IEEE, pp 413–422
Liu S, Yamada M, Collier N, Sugiyama M (2013) Change-point detection in time-series data by relative density-ratio estimation. Neural Netw 43:72–83
Liu W, Li JQ, Yu W, Yang G (2021) Change-point detection approaches for pavement dynamic segmentation. J Transp Eng Part B: Pavements 147(2):06021001
Malladi R, Kalamangalam GP, Aazhang B (2013) Online Bayesian change point detection algorithms for segmentation of epileptic activity. In: 2013 Asilomar conference on signals, systems and computers. IEEE, pp 1833–1837
Munir M, Siddiqui SA, Dengel A, Ahmed S (2018) DeepAnT: A deep learning approach for unsupervised anomaly detection in time series. IEEE Access 7:1991–2005
Oikarinen E, Tiittanen HE, Henelius A, Puolamäki K (2021) Detecting virtual concept drift of regressors without ground truth values. Data Min Knowl Disc 35:726–747. https://doi.org/10.1007/s10618-021-00739-7
Perslev M, Jensen MH, Darkner S, Jennum PJ, Igel C (2019) U-time: a fully convolutional network for time series segmentation applied to sleep staging. arXiv:1910.11162
Reddy S, Mun M, Burke J, Estrin D, Hansen M, Srivastava M (2010) Using mobile phones to determine transportation modes. ACM Trans Sens Netw (TOSN) 6(2):1–27
Reeves J, Chen J, Wang XL, Lund R, Lu QQ (2007) A review and comparison of changepoint detection techniques for climate data. J Appl Meteorol Climatol 46(6):900–915
Schölkopf B, Williamson RC, Smola A, Shawe-Taylor J, Platt J (1999) Support vector method for novelty detection. Adv Neural Inf Process Syst 12
Shi Z, Chehade A (2021) A dual-LSTM framework combining change point detection and remaining useful life prediction. Reliab Eng Syst Saf 205:107257
Shou MZ, Lei SW, Wang W, Ghadiyaram D, Feiszli M (2021) Generic event boundary detection: a benchmark for event segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 8075–8084
Staudacher M, Telser S, Amann A, Hinterhuber H, Ritsch-Marte M (2005) A new method for change-point detection developed for on-line analysis of the heart beat variability during sleep. Physica A 349(3–4):582–596
Tax DMJ, Duin RPW (2004) Support vector data description. Mach Learn 54:45–66
Truong C, Oudre L, Vayatis N (2020) Selective review of offline change point detection methods. Signal Process 167:107299. https://doi.org/10.1016/j.sigpro.2019.107299
Turner RD (2012) Gaussian processes for state space models and change point detection. PhD thesis, University of Cambridge
van den Burg GJJ, Williams CKI (2020) An evaluation of change point detection algorithms
Xuan X, Murphy K (2007) Modeling changing dependency structure in multivariate time series. In: Proceedings of the 24th international conference on machine learning, ICML ’07. Association for Computing Machinery, New York, NY, USA, pp 1055–1062. https://doi.org/10.1145/1273496.1273629
Yang P, Dumont G, Ansermino JM (2006) Adaptive change detection in heart rate trend monitoring in anesthetized children. IEEE Trans Biomed Eng 53(11):2211–2219. https://doi.org/10.1109/TBME.2006.877107
Zhang R, Hao Y, Yu D, Chang W-C, Lai G, Yang Y (2020) Correlation-aware unsupervised change-point detection via graph neural networks
Zhou C, Paffenroth RC (2017) Anomaly detection with robust deep autoencoders. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 665–674
Funding
This research received funding from the Flemish Government (AI Research Program) and from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 802895). All authors are affiliated to Leuven.AI—KU Leuven institute for AI, B-3000, Leuven, Belgium.
Ethics declarations
Conflict of interest
We declare that we have no conflict of interest.
Additional information
Responsible editor: Panagiotis Papapetrou.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: In this article, the reference has been incorrectly published as “De Brabandere A, Cao Z, De Vos M, Bertrand A, Davis J (2022) TSFuse: automated feature construction for multiple time series data. Mach Learn. https://doi.org/10.1007/s10994-021-06096-2”. It should have been “De Brabandere, A., Op De Beéck, T., Hendrickx, K., Meert, W., & Davis, J. (2022). TSFuse: Automated feature construction for multiple time series data. Machine Learning, pp 1-56. https://doi.org/10.1007/s10994-021-06096-2”.
Appendix: Limitation of AUROC in evaluating CPD algorithms
We discuss the limitations of using AUROC to evaluate CPD algorithms via a toy example. Before presenting the example, we will first introduce the definition of the AUROC metric in the context of CPD.
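Since imbalanced data is common in CPD tasks, many papers in the CPD literature (e.g., KLCPD (Chang et al. 2019), RuLSIF (Liu et al. 2013), ABD (Lee et al. 2018), and TIRE (De Ryck et al. 2021)) define the true positive rate (TPR) and false positive rate (FPR) as:
\[ TPR = \frac{TP}{TP + FN} \]
and
\[ FPR = \frac{FP}{TP + FP}, \]
where \(TP\), \(FP\), and \(FN\) denote the numbers of true positives, false positives, and false negatives, respectively.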
The ROC curve is then obtained by varying a detection threshold (\(\upsilon\)) from high to low. Point \((FPR, TPR) = (1.0, 1.0)\) is manually added to the ROC curve to ensure that a perfect performance corresponds to an AUROC of 1 (De Ryck et al. 2021).
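The threshold sweep can be illustrated with a minimal Python sketch. It assumes count-based rates of the kind used in the cited CPD literature, namely TPR = TP / (TP + FN) and FPR = FP / (TP + FP), and it assumes that each alarm has already been matched against the ground truth. The function name `roc_points`, its arguments, and the example scores are hypothetical and are not taken from the article or its repository.

```python
from typing import List, Tuple


def roc_points(scores: List[float], is_true_cp: List[bool], n_gt: int) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) pairs as the threshold drops past each alarm score.

    scores      -- dissimilarity score of each detected alarm (hypothetical input)
    is_true_cp  -- whether each alarm was matched to a ground-truth change point
    n_gt        -- number of ground-truth change points (must be > 0)
    """
    # Visit alarms from the highest to the lowest score (threshold high -> low).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = []
    tp = fp = 0
    for i in order:
        if is_true_cp[i]:
            tp += 1
        else:
            fp += 1
        fn = n_gt - tp                # ground-truth change points still missed
        tpr = tp / (tp + fn)          # assumed definition: TP / (TP + FN)
        fpr = fp / (tp + fp)          # assumed definition: FP / (TP + FP)
        points.append((fpr, tpr))
    points.append((1.0, 1.0))         # manually added end point
    return points


# Top-scoring alarm is a true change point: the curve starts at FPR = 0.
good = roc_points([0.9, 0.8, 0.7], [True, True, False], n_gt=2)
# Top-scoring alarm is a false positive: the curve starts at (1.0, 0.0).
bad = roc_points([0.9, 0.8, 0.7], [False, True, True], n_gt=2)
print(good)
print(bad)
```

Under these assumed definitions the FPR is not monotonic in the threshold, so a single high-scoring false alarm is enough to move the first point of the curve to (1.0, 0.0).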
Based on these definitions, we show an illustrative toy example in Fig. 7. Here, we show two possible detection results, e.g., by two hypothetical CPD algorithms. In Table 4, we summarize the number of samples in these two detection results and compute the f1-scores.
As shown in Table 4, Result 2 detects one more change point and has one fewer false negative, thereby producing a better f1-score than Result 1. However, the AUROC achieved by Result 2 is much worse than that of Result 1. This is because CPD algorithm 2 assigns a higher dissimilarity value to the false positive sample located at time step 375, causing the corresponding ROC curve to start from point (1.0, 0.0) and take on an awkward shape. In real-world applications, the dissimilarity produced by a CPD algorithm is determined locally and is usually very sensitive to outlier data samples and initial model parameters. This makes the start of the ROC curve very sensitive to stochastic effects, often leading to awkward shapes as in the second example of Fig. 7. The same CPD algorithm can even produce very different AUROC values on the same dataset due to different initial weights. This is why we chose to evaluate the CPD approaches in this study based on the f1-score.
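The contrast between the two metrics can be sketched in a few lines of Python. The counts and scores below are hypothetical and are not the values of Table 4 or Fig. 7. The sketch assumes the usual count-based f1-score, 2TP / (2TP + FP + FN), and an FPR of the form FP / (TP + FP), so that a single top-ranked alarm yields an FPR of either 0 or 1.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """f1-score from counts; it depends only on the counts, not on the score ranking."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def leading_fpr(alarms) -> float:
    """FPR when only the highest-scoring alarm is kept (assumed FPR = FP / (TP + FP)).

    alarms -- list of (score, is_true_cp) pairs (hypothetical input format)
    """
    _, top_is_true = max(alarms, key=lambda a: a[0])
    return 0.0 if top_is_true else 1.0


# Hypothetical result A: 2 of 4 change points found, one low-scoring false alarm.
result_a = [(0.9, True), (0.8, True), (0.3, False)]
# Hypothetical result B: 3 of 4 change points found, but the false alarm scores highest.
result_b = [(0.95, False), (0.9, True), (0.8, True), (0.7, True)]

f1_a = f1_score(tp=2, fp=1, fn=2)   # 4 / 7
f1_b = f1_score(tp=3, fp=1, fn=1)   # 6 / 8
print(f1_a, leading_fpr(result_a))
print(f1_b, leading_fpr(result_b))
```

In this hypothetical case, result B has the higher f1-score, yet its ROC curve would begin at an FPR of 1.0 because its single false alarm happens to carry the largest score. This is the kind of ranking sensitivity the appendix describes.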
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cao, Z., Seeuws, N., Vos, M.D. et al. A semi-supervised interactive algorithm for change point detection. Data Min Knowl Disc 38, 623–651 (2024). https://doi.org/10.1007/s10618-023-00974-0
DOI: https://doi.org/10.1007/s10618-023-00974-0