Incremental feature selection approach to multi-dimensional variation based on matrix dominance conditional entropy for ordered data set
Applied Intelligence ( IF 5.3 ) Pub Date : 2024-04-10 , DOI: 10.1007/s10489-024-05411-3
Weihua Xu , Yifei Yang , Yi Ding , Xiyang Chen , Xiaofang Lv

Rough set theory is a mathematical tool widely employed across many fields to handle uncertainty. Feature selection, an essential and independent research area within rough set theory, aims to identify a small subset of important features by eliminating irrelevant, redundant, or noisy ones. In practice, data features change constantly over time and with other factors, producing ordered data sets whose feature sets vary. Existing feature selection methods are ill suited to such data sets: when features change, they discard previous reduction results and recompute from scratch, which is very time-consuming. Incremental attribute reduction algorithms address this issue by reusing prior reduction results, effectively reducing computation time. Motivated by this approach, this paper investigates incremental feature selection algorithms for ordered data sets with changing features. First, we discuss the dominance matrix and the dominance conditional entropy, and introduce update principles for the new dominance matrix and dominance diagonal matrix when features change. We then propose two incremental feature selection algorithms for adding (IFS-A) or deleting (IFS-D) features in an ordered data set. Nine UCI data sets are used to evaluate the performance of the proposed algorithms. The experimental results show that the average classification accuracies of IFS-A and IFS-D under four classifiers on twelve data sets are 82.05% and 80.75%, improvements of 5.48% and 3.68%, respectively, over the original data.
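The core idea above, a dominance relation between samples that can be updated incrementally when a feature is added rather than recomputed from scratch, can be illustrated with a minimal sketch. The function names, shapes, and brute-force loops here are ours, not the paper's notation, and the real IFS-A algorithm additionally maintains the dominance diagonal matrix and the dominance conditional entropy; this only shows why adding a feature needs just an intersection with the old relation.

```python
import numpy as np

def dominance_matrix(X):
    """Dominance matrix of an ordered data set X (rows = samples,
    columns = criterion features): R[i, j] = 1 iff sample j is at
    least as good as sample i on every feature."""
    n = X.shape[0]
    R = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if np.all(X[j] >= X[i]):
                R[i, j] = 1
    return R

def add_feature(R, new_col):
    """Incremental update when one feature is added: the new dominance
    relation is the old relation intersected with dominance on the new
    column, so only entries that are currently 1 need re-checking."""
    n = len(new_col)
    R_new = R.copy()
    for i in range(n):
        for j in range(n):
            if R_new[i, j] and new_col[j] < new_col[i]:
                R_new[i, j] = 0
    return R_new
```

For example, `add_feature(dominance_matrix(X[:, :2]), X[:, 2])` yields the same matrix as `dominance_matrix(X)` on a three-column `X`, but touches only the entries that were 1 under the first two features, which is the source of the time savings the abstract describes.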




Updated: 2024-04-10