
A semi-supervised interactive algorithm for change point detection

Data Mining and Knowledge Discovery

A Correction to this article was published on 02 January 2024

This article has been updated

Abstract

The goal of change point detection (CPD) is to identify abrupt changes in the statistics of signals or time series that reflect transitions in the underlying system’s properties or states. While many statistical and learning-based approaches have been proposed to address this task, most state-of-the-art methods still treat this problem in an unsupervised setting. As a result, there is often a large gap between the algorithm-detected results and the expected outcomes of the user. To bridge this gap, we propose an active-learning strategy for the CPD problem that is combined with a one-class support vector machine (OCSVM) model, resulting in an interactive CPD algorithm that improves itself by querying the end-user. This approach enables us to focus on detecting the desired change points and to ignore false positives or irrelevant change points. We demonstrate that the interactive OCSVM model can be combined with various unsupervised CPD models to function in a semi-supervised setting, resulting in improved detection accuracy. Our experimental results on various simulated and real-life datasets demonstrate a significant improvement in detection performance on both single- and multi-channel time series, even with a limited number of queries.


Code Availability

The code described in the article is available in the following repository: https://github.com/caozhenxiang/ICPD.


Notes

  1. https://github.com/arnedb/tsfuse.

References

  • Adams RP, MacKay DJC (2007) Bayesian online changepoint detection. Mach Learn

  • Aminikhanghahi S, Cook DJ (2017) A survey of methods for time series change point detection. Knowl Inf Syst 51(2):339–367. https://doi.org/10.1007/s10115-016-0987-z


  • An J, Cho S (2015) Variational autoencoder based anomaly detection using reconstruction probability. Spec Lect IE 2(1):1–18


  • Appel U, Brandt AV (1983) Adaptive sequential segmentation of piecewise stationary time series. Inf Sci 29(1):27–56


  • Basseville M, Nikiforov IV, et al (1993) Detection of abrupt changes: theory and application, vol 104. Prentice Hall, Englewood Cliffs

  • Bellinger C, Sharma S, Japkowicz N (2012) One-class versus binary classification: Which and when? In: 2012 11th international conference on machine learning and applications, vol 2, pp 102–106. https://doi.org/10.1109/ICMLA.2012.212

  • Bosc M, Heitz F, Armspach J-P, Namer I, Gounot D, Rumbach L (2003) Automatic change detection in multimodal serial MRI: application to multiple sclerosis lesion evolution. Neuroimage 20(2):643–656


  • Brandt AV (1983) Detecting and estimating parameter jumps using ladder algorithms and likelihood ratio tests. In: ICASSP’83. IEEE international conference on acoustics, speech, and signal processing, vol 8. IEEE, pp 1017–1020

  • Breunig MM, Kriegel H-P, Ng RT, Sander J (2000) LOF: identifying density-based local outliers. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data, pp 93–104

  • Chandola V, Vatsavai RR (2010) Scalable time series change detection for biomass monitoring using gaussian process. In: CIDU, pp 69–82

  • Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv (CSUR) 41(3):1–58


  • Chang W-C, Li C-L, Yang Y, Póczos B (2019) Kernel change-point detection with auxiliary deep generative models. arXiv:1901.06077

  • Cheng KC, Miller EL, Hughes MC, Aeron S (2020) On matched filtering for statistical change point detection. IEEE Open J Signal Process 1:159–176


  • Chib S (1998) Estimation and comparison of multiple change-point models. J Econom 86(2):221–241


  • Cho H, Fryzlewicz P (2015) Multiple-change-point detection for high dimensional time series via sparsified binary segmentation. J R Stat Soc Ser B Stat Methodol 475–507

  • Cleland I, Han M, Nugent C, Lee H, McClean S, Zhang S, Lee S (2014) Evaluation of prompted annotation of activity data recorded from a smart phone. Sensors 14(9):15861–15879


  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297


  • De Brabandere A, Cao Z, De Vos M, Bertrand A, Davis J (2022) Semi-supervised change point detection using active learning. In: International conference on discovery science. Springer, pp 74–88

  • De Brabandere A, Op De Beéck T, Hendrickx K, Meert W, Davis J (2022) TSFuse: Automated feature construction for multiple time series data. Machine Learning, pp 1–56. https://doi.org/10.1007/s10994-021-06096-2

  • De Ryck T, De Vos M, Bertrand A (2021) Change point detection in time series data using autoencoders with a time-invariant representation. IEEE Trans Signal Process

  • Deldari S, Smith DV, Xue H, Salim FD (2021) Time series change point detection with self-supervised contrastive predictive coding. In: Proceedings of the web conference 2021, pp 3124–3135

  • Desobry F, Davy M, Doncarli C (2005) An online kernel change detection algorithm. IEEE Trans Signal Process 53(8):2961–2974


  • Ducré-Robitaille J-F, Vincent LA, Boulet G (2003) Comparison of techniques for detection of discontinuities in temperature series. Int J Climatol A J R Meteorol Soc 23(9):1087–1101


  • Ebrahimzadeh Z, Zheng M, Karakas S, Kleinberg S (2019) Deep learning for multi-scale changepoint detection in multivariate time series

  • Gupta M, Wadhvani R, Rasool A (2022) Real-time change-point detection: a deep neural network-based adaptive approach for detecting changes in multivariate time series data. Expert Syst Appl 209:118260. https://doi.org/10.1016/j.eswa.2022.118260


  • Itoh N, Kurths J (2010) Change-point detection of climate time series by nonparametric method. In: Proceedings of the world congress on engineering and computer science, vol 1. Citeseer, pp 445–448

  • Lee W-H, Ortiz J, Ko B, Lee R (2018) Time series segmentation through automatic feature learning. arXiv:1801.05394

  • Li J, Lei P, Todorovic S (2019) Weakly supervised energy-based learning for action segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6243–6251

  • Liu FT, Ting KM, Zhou Z-H (2008) Isolation forest. In: 2008 Eighth IEEE international conference on data mining. IEEE, pp 413–422

  • Liu S, Yamada M, Collier N, Sugiyama M (2013) Change-point detection in time-series data by relative density-ratio estimation. Neural Netw 43:72–83


  • Liu W, Li JQ, Wenying Yu, Yang G (2021) Change-point detection approaches for pavement dynamic segmentation. J Transp Eng Part B: Pavements 147(2):06021001


  • Malladi R, Kalamangalam GP, Aazhang B (2013) Online Bayesian change point detection algorithms for segmentation of epileptic activity. In: 2013 Asilomar conference on signals, systems and computers. IEEE, pp 1833–1837

  • Munir M, Siddiqui SA, Dengel A, Ahmed S (2018) DeepAnT: a deep learning approach for unsupervised anomaly detection in time series. IEEE Access 7:1991–2005


  • Oikarinen E, Tiittanen HE, Henelius A, Puolamäki K (2021) Detecting virtual concept drift of regressors without ground truth values. Data Min Knowl Disc 35:726–747. https://doi.org/10.1007/s10618-021-00739-7


  • Perslev M, Jensen MH, Darkner S, Jennum PJ, Igel C (2019) U-time: a fully convolutional network for time series segmentation applied to sleep staging. arXiv:1910.11162

  • Reddy S, Mun M, Burke J, Estrin D, Hansen M, Srivastava M (2010) Using mobile phones to determine transportation modes. ACM Trans Sens Netw (TOSN) 6(2):1–27


  • Reeves J, Chen J, Wang XL, Lund R, Qi Qi L (2007) A review and comparison of changepoint detection techniques for climate data. J Appl Meteorol Climatol 46(6):900–915


  • Schölkopf B, Williamson RC, Smola A, Shawe-Taylor J, Platt J (1999) Support vector method for novelty detection. Adv Neural Inf Process Syst 12

  • Shi Z, Chehade A (2021) A dual-LSTM framework combining change point detection and remaining useful life prediction. Reliab Eng Syst Saf 205:107257


  • Shou MZ, Lei SW, Wang W, Ghadiyaram D, Feiszli M (2021) Generic event boundary detection: a benchmark for event segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 8075–8084

  • Staudacher M, Telser S, Amann A, Hinterhuber H, Ritsch-Marte M (2005) A new method for change-point detection developed for on-line analysis of the heart beat variability during sleep. Physica A 349(3–4):582–596


  • Tax DMJ, Duin RPW (2004) Support vector data description. Mach Learn 54:45–66


  • Truong C, Oudre L, Vayatis N (2020) Selective review of offline change point detection methods. Signal Process 167:107299. https://doi.org/10.1016/j.sigpro.2019.107299


  • Turner RD (2012) Gaussian processes for state space models and change point detection. PhD thesis, University of Cambridge

  • van den Burg GJJ, Williams CKI (2020) An evaluation of change point detection algorithms

  • Xuan X, Murphy K (2007) Modeling changing dependency structure in multivariate time series. In: Proceedings of the 24th international conference on machine learning, ICML ’07. Association for Computing Machinery, New York, NY, USA, pp 1055–1062. https://doi.org/10.1145/1273496.1273629

  • Yang P, Dumont G, Ansermino JM (2006) Adaptive change detection in heart rate trend monitoring in anesthetized children. IEEE Trans Biomed Eng 53(11):2211–2219. https://doi.org/10.1109/TBME.2006.877107


  • Zhang R, Hao Y, Yu D, Chang W-C, Lai G, Yang Y (2020) Correlation-aware unsupervised change-point detection via graph neural networks

  • Zhou C, Paffenroth RC (2017) Anomaly detection with robust deep autoencoders. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 665–674


Funding

This research received funding from the Flemish Government (AI Research Program) and from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 802895). All authors are affiliated to Leuven.AI—KU Leuven institute for AI, B-3000, Leuven, Belgium.

Author information


Corresponding author

Correspondence to Zhenxiang Cao.

Ethics declarations

Conflict of interest

We declare that we have no conflict of interest.

Additional information

Responsible editor: Panagiotis Papapetrou.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: In this article, the reference has been incorrectly published as “De Brabandere A, Cao Z, De Vos M, Bertrand A, Davis J (2022) TSFuse: automated feature construction for multiple time series data. Mach Learn. https://doi.org/10.1007/s10994-021-06096-2”. It should have been “De Brabandere, A., Op De Beéck, T., Hendrickx, K., Meert, W., & Davis, J. (2022). TSFuse: Automated feature construction for multiple time series data. Machine Learning, pp 1-56. https://doi.org/10.1007/s10994-021-06096-2”.

Appendix: Limitation of AUROC in evaluating CPD algorithms


We discuss the limitations of using AUROC to evaluate CPD algorithms via a toy example. Before presenting the example, we will first introduce the definition of the AUROC metric in the context of CPD.

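Since imbalanced data is common in CPD tasks, many papers in the CPD literature (e.g., KLCPD (Chang et al. 2019), RuLSIF (Liu et al. 2013), ABD (Lee et al. 2018), and TIRE (De Ryck et al. 2021)) define the true positive rate (TPR) and false positive rate (FPR) from the number of true positives \(N_{TP}\), the number of false positives \(N_{FP}\), and the number of ground-truth change points \(N_{GT}\) as: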

$$\begin{aligned} TPR = \dfrac{N_{TP}}{N_{GT}} \end{aligned} \tag{10}$$

and

$$\begin{aligned} FPR = \dfrac{N_{FP}}{N_{TP}+N_{FP}}. \end{aligned} \tag{11}$$
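As an illustration of Eqs. (10) and (11), the following Python sketch computes both rates for a single detection threshold. The matching rule (a detection counts as a true positive when it lies within `tol` time steps of a not-yet-matched ground-truth change point), the function names, and the example values are assumptions made for illustration and are not taken from the article. Because Eq. (11) normalizes by the number of detections rather than by the number of negatives, the FPR defined here equals one minus the precision.

```python
def count_detections(detected, ground_truth, tol):
    """Count true and false positives among detected change points.

    Assumed rule (not from the article): a detection is a true positive if an
    unclaimed ground-truth change point lies within `tol` time steps of it.
    """
    unclaimed = sorted(ground_truth)
    n_tp = 0
    for d in sorted(detected):
        hit = next((g for g in unclaimed if abs(g - d) <= tol), None)
        if hit is not None:
            unclaimed.remove(hit)  # each ground-truth point is matched once
            n_tp += 1
    return n_tp, len(detected) - n_tp


def cpd_rates(n_tp, n_fp, n_gt):
    """TPR and FPR following Eqs. (10) and (11)."""
    tpr = n_tp / n_gt
    n_det = n_tp + n_fp
    # With no detections Eq. (11) is 0/0; returning 0.0 is our own convention.
    fpr = n_fp / n_det if n_det else 0.0
    return tpr, fpr


ground_truth = [100, 200, 300, 400]  # hypothetical change point locations
detected = [98, 205, 375]            # hypothetical detections
n_tp, n_fp = count_detections(detected, ground_truth, tol=10)
tpr, fpr = cpd_rates(n_tp, n_fp, len(ground_truth))
print(n_tp, n_fp, tpr, round(fpr, 3))  # 2 1 0.5 0.333
```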

The ROC curve is then obtained by varying a detection threshold (\(\upsilon\)) from high to low. Point \((FPR, TPR) = (1.0, 1.0)\) is manually added to the ROC curve to ensure that a perfect performance corresponds to an AUROC of 1 (De Ryck et al. 2021).
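The threshold sweep can be sketched in Python as follows. The input format (one `(score, is_true_positive)` pair per detected peak), the function names, and the choice to sort the points by FPR before applying the trapezoidal rule are assumptions made for this sketch; the article does not specify how the area is integrated. The example only checks that a detector without false positives reaches an AUROC of 1 once the point (1.0, 1.0) is appended.

```python
def roc_points(peaks, n_gt):
    """Sweep the threshold from high to low over the peak scores.

    `peaks` is a list of (score, is_true_positive) pairs, one per detected
    peak; this input format is an assumption made for the sketch.
    """
    points = []
    for threshold in sorted({score for score, _ in peaks}, reverse=True):
        kept = [is_tp for score, is_tp in peaks if score >= threshold]
        n_tp = sum(kept)
        n_fp = len(kept) - n_tp
        points.append((n_fp / (n_tp + n_fp), n_tp / n_gt))  # (FPR, TPR)
    points.append((1.0, 1.0))  # manually added end point
    return points


def auroc(points):
    """Trapezoidal area after sorting by FPR (an assumed convention)."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical detector whose three peaks are all true positives.
perfect = [(0.9, True), (0.8, True), (0.7, True)]
print(auroc(roc_points(perfect, n_gt=3)))  # 1.0
```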

Based on these definitions, we show an illustrative toy example in Fig. 7. Here, we show two possible detection results, e.g., by two hypothetical CPD algorithms. In Table 4, we summarize the number of samples in these two detection results and compute the f1-scores.

Fig. 7

Toy example showing a limitation of the AUROC metric. In the left column, the two sub-figures simulate two detection results obtained from two hypothetical CPD algorithms on the same time series. The vertical red lines mark the locations of the ground-truth change points. The black curve denotes the dissimilarity measure produced by the CPD algorithm (a high value corresponds to a potential CP). Green points and black points represent true positive and false positive samples, respectively. The three colored horizontal dashed lines represent three values of the detection threshold \(\upsilon\) used while generating the ROC curve. In the right column, the corresponding ROC curves are plotted. The three colored points on each ROC curve correspond to the \(\upsilon\) values shown in the same colors in the left sub-figures

Table 4 Summary of detection results in toy example

As shown in Table 4, Result 2 detects one more change point and has one fewer false negative, thereby producing a better f1-score than Result 1. However, the AUROC achieved by Result 2 is much worse than that of Result 1. This is because CPD algorithm 2 assigns a higher dissimilarity value to the false positive sample located at time step 375, causing the corresponding ROC curve to start from point (1.0, 0.0), resulting in an awkward shape. In real-world applications, the dissimilarity produced by a CPD algorithm is determined locally and is usually very sensitive to outlier data samples and initial model parameters. This makes the start of the ROC curve very sensitive to stochastic effects, often leading to awkward shapes as in the second example of Fig. 7. The same CPD algorithm can even result in very different AUROC values on the same dataset due to different initial weights. This is why we chose to evaluate the CPD approaches in this study based on the f1-score.
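The effect can be reproduced with a few lines of Python. The sketch below uses the standard definition of the f1-score, \(2N_{TP}/(2N_{TP}+N_{FP}+N_{FN})\), and reports where the ROC curve starts under Eqs. (10) and (11). All scores and counts are made-up values for two hypothetical results A and B; they are not the values of Table 4 or Fig. 7, and the function names are our own.

```python
def f1_score(n_tp, n_fp, n_fn):
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 if there are no positives at all."""
    denom = 2 * n_tp + n_fp + n_fn
    return 2 * n_tp / denom if denom else 0.0


def roc_start(peaks, n_gt):
    """(FPR, TPR) at the highest threshold, where only the top peak is kept."""
    _, top_is_tp = max(peaks, key=lambda p: p[0])
    return (0.0, 1 / n_gt) if top_is_tp else (1.0, 0.0)


n_gt = 4  # hypothetical number of ground-truth change points
# (score, is_true_positive) per detected peak; all values are made up.
result_a = [(0.9, True), (0.8, True), (0.3, False)]
result_b = [(0.9, False), (0.8, True), (0.7, True), (0.6, True)]

for name, peaks in [("A", result_a), ("B", result_b)]:
    n_tp = sum(is_tp for _, is_tp in peaks)
    n_fp = len(peaks) - n_tp
    n_fn = n_gt - n_tp  # assumes each true positive matches a distinct CP
    print(name, round(f1_score(n_tp, n_fp, n_fn), 3), roc_start(peaks, n_gt))
# A 0.571 (0.0, 0.25)
# B 0.75 (1.0, 0.0)
```

In this made-up case, B finds more change points and has the higher f1-score, yet its ROC curve starts at (1.0, 0.0) solely because its single false positive received the highest score.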

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Cao, Z., Seeuws, N., Vos, M.D. et al. A semi-supervised interactive algorithm for change point detection. Data Min Knowl Disc 38, 623–651 (2024). https://doi.org/10.1007/s10618-023-00974-0


  • DOI: https://doi.org/10.1007/s10618-023-00974-0
