FBDD: feature-based drift detector for batch processing data
Cluster Computing ( IF 4.4 ) Pub Date : 2024-03-08 , DOI: 10.1007/s10586-024-04284-y
Piotr Porwik , Krzysztof Wrobel , Tomasz Orczyk , Rafał Doroz

Abstract

Concept drift and data drift have received much attention in recent years. Both are crucial in many domains whose generative processes exhibit non-stationary and cyclical patterns. Drift detection can be treated as a supervised task, in which labeled data are constantly used to validate the learned model. In practice this is rarely feasible, because labeling is complex, costly, and time-consuming. Unsupervised change detection techniques, on the other hand, are cumbersome in applications because they generate many false alarms. This paper presents a new concept drift detection method based on feature analysis. A data stream carries information about distribution patterns that reflect the different concepts that may be hidden in the data. The essential features are selected and ranked by LASSO. The feature ranking and statistics are then used to detect feature drift. The proposed approach was evaluated experimentally on synthetic and real-world datasets. The results show that the proposed FBDD algorithm has an advantage over other solutions.
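The abstract does not give the details of FBDD, so the following is only a toy sketch of the general idea of ranking features with LASSO per batch and comparing the rankings between batches. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, the synthetic batches, the penalty `alpha=0.5`, the Jaccard comparison of the top-k features, and the threshold of 0.5. LASSO is solved here by plain cyclic coordinate descent on the objective (1/2n)·||y − Xw||² + alpha·||w||₁, without an intercept or feature standardization.

```python
import random


def lasso_coordinate_descent(X, y, alpha=0.5, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            # Soft-thresholding sets weakly related features exactly to zero.
            if rho > alpha:
                new_w = (rho - alpha) / col_sq[j]
            elif rho < -alpha:
                new_w = (rho + alpha) / col_sq[j]
            else:
                new_w = 0.0
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


def rank_features(w):
    """Indices of features with non-zero weight, by decreasing |weight|."""
    selected = [j for j in range(len(w)) if w[j] != 0.0]
    return sorted(selected, key=lambda j: -abs(w[j]))


def jaccard(rank_a, rank_b):
    """Jaccard similarity of two feature sets (1.0 if both are empty)."""
    a, b = set(rank_a), set(rank_b)
    return 1.0 if not (a or b) else len(a & b) / len(a | b)


def make_batch(n, informative, seed):
    """Hypothetical synthetic batch: 5 Gaussian features, y depends on `informative`."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n)]
    y = [sum(2.0 * row[j] for j in informative) + rng.gauss(0, 0.1) for row in X]
    return X, y


def batch_ranking(informative, seed, k=2):
    """Top-k LASSO-ranked features of one synthetic batch."""
    X, y = make_batch(300, informative, seed)
    return rank_features(lasso_coordinate_descent(X, y))[:k]


reference = batch_ranking((0, 1), seed=1)      # reference concept
same_concept = batch_ranking((0, 1), seed=2)   # new batch, same concept
drifted = batch_ranking((3, 4), seed=3)        # new batch, different concept

THRESHOLD = 0.5  # assumed cut-off, not taken from the paper
for name, ranking in [("same concept", same_concept), ("drifted", drifted)]:
    similarity = jaccard(reference, ranking)
    print(name, ranking, "drift" if similarity < THRESHOLD else "no drift")
```

A set comparison like this ignores the order within the top-k features and the statistics that the abstract mentions, so it should be read as a starting point for experiments rather than as a replacement for the published method.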




Updated: 2024-03-09