Computation of persistent homology on streaming data using topological data summaries
Computational Intelligence (IF 2.8), Pub Date: 2023-07-30, DOI: 10.1111/coin.12597
Anindya Moitra, Nicholas O. Malott, Philip A. Wilsey

Persistent homology is a computationally intensive yet extremely powerful tool for Topological Data Analysis. Applying it to a potentially infinite sequence of data objects is a challenging task. For this reason, persistent homology and data stream mining have long been two important but disjoint areas of data science. The first computational model to bridge the gap between the two areas, introduced recently, is useful for detecting steady or gradual changes in data streams, such as certain genomic modifications during the evolution of species. However, that model is not suitable for applications that encounter abrupt changes of extremely short duration. This paper presents another model for computing persistent homology on streaming data that addresses this shortcoming of the previous work. The model is validated on the important real-world application of network anomaly detection. It is shown that, in addition to detecting the occurrence of anomalies or attacks in computer networks, the proposed model is able to visually identify several types of traffic. Moreover, the model can accurately detect abrupt changes of both extremely short and longer duration in the network traffic. These capabilities are not achievable with the previous model or with traditional data mining techniques.

Updated: 2023-07-30