A new definition for feature selection stability analysis
Annals of Mathematics and Artificial Intelligence (IF 1.2), Pub Date: 2024-03-01, DOI: 10.1007/s10472-024-09936-8
Teddy Lazebnik, Avi Rosenfeld

Feature selection (FS) stability is an important topic of recent interest. Finding stable features is important for creating reliable, non-overfitted feature sets, which in turn can be used to build machine learning models that are more accurate, easier to explain, and less prone to adversarial attacks. Several definitions of FS stability are currently in wide use. In this paper, we demonstrate that existing stability metrics fail to quantify certain key properties of many datasets, such as resilience to data drift or to non-uniformly distributed missing values. To address this shortcoming, we propose a new definition of FS stability inspired by Lyapunov stability in dynamic systems. We show that the proposed definition is statistically different from the classical record-stability on \(n=90\) datasets. We present the advantages and disadvantages of the Lyapunov and other stability definitions and demonstrate three scenarios, in each of which a different one of the three proposed stability metrics is best suited.
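
The abstract does not spell out any of the stability definitions it compares. As a rough illustration of what an FS stability score measures in general, the sketch below computes the mean pairwise Jaccard similarity between the feature subsets selected on random subsamples of the records. This is one common style of measure and is not the Lyapunov-inspired definition proposed in the paper. The selector (top-k features by absolute correlation with the target), the function names, and the parameters `n_rounds` and `frac` are all assumptions made for illustration.

```python
import random
from itertools import combinations


def abs_correlation(xs, ys):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no signal
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_k(rows, target, k):
    """Hypothetical selector: the k feature indices most correlated with the target."""
    n_features = len(rows[0])
    scores = [abs_correlation([r[j] for r in rows], target) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def subsample_stability(rows, target, k, n_rounds=20, frac=0.8, seed=0):
    """Mean pairwise Jaccard similarity of selections over random record subsamples.

    Assumes n_rounds >= 2 so that at least one pair of selections exists.
    """
    rng = random.Random(seed)
    n = len(rows)
    m = max(2, int(frac * n))
    selections = []
    for _ in range(n_rounds):
        idx = rng.sample(range(n), m)  # subsample records without replacement
        selections.append(
            select_top_k([rows[i] for i in idx], [target[i] for i in idx], k)
        )
    pairs = list(combinations(selections, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A score near 1 means the selector picks almost the same features regardless of which records it sees, and a score near 0 means the selections barely overlap. A measure of this kind only perturbs which records are sampled, so on its own it says nothing about data drift or structured missing values, the cases the abstract argues existing metrics fail to capture.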




Updated: 2024-03-02