A Software Defect Prediction Approach Based on Hybrid Feature Dimensionality Reduction, Scientific Programming


Scientific Programming (IF 1.672), Pub Date: 2023-07-20, DOI: 10.1155/2023/5585130
Shenggang Zhang, Shujuan Jiang, Yue Yan


Software defect prediction (SDP) assists software testing by helping teams allocate test resources sensibly, which reduces cost and improves development efficiency. To improve prediction performance, researchers have designed many defect-related features for SDP. However, the feature redundancy (FR) and irrelevance that come with increasing data dimensionality can greatly degrade defect prediction performance. To address these problems, researchers have proposed various dimensionality reduction methods, which fall into two broad categories: feature selection and feature extraction. Each category has its own advantages and limitations. In this paper, we propose a Hybrid Feature Dimensionality Reduction Approach (HFDRA) for SDP that combines the two kinds of methods to improve prediction performance. HFDRA has two stages: feature selection and feature extraction. First, in the feature selection stage, HFDRA divides the original features into several feature subsets using a clustering algorithm. Then, in the feature extraction stage, kernel principal component analysis (KPCA) reduces the dimensionality of each feature subset. Finally, the reduced-dimensional data are used to build the prediction model. In the empirical study, we use 22 projects from AEEEM, SOFTLAB, MORP, and ReLink as experimental subjects. We first compare our approach with seven baseline methods and three state-of-the-art methods, and then analyze the relationship between FR and prediction performance. The experimental results show that our approach outperforms the state-of-the-art data dimensionality reduction methods for defect prediction.
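
The two-stage idea in the abstract (group the features, then apply KPCA to each group) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, the kernel, the number of components, or the classifier, so k-means over feature columns, an RBF kernel, two components per subset, and logistic regression are all stand-ins chosen here. The function name `hybrid_reduce` and the synthetic data are hypothetical as well.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def hybrid_reduce(X_train, X_test, n_clusters=3, n_components=2, seed=0):
    """Cluster the features, then run KPCA separately on each feature subset."""
    # Standardize with statistics from the training split only.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # Stage 1 (assumed algorithm): k-means over the columns, so that each
    # feature is treated as one point and similar features share a cluster.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(X_train.T)

    # Stage 2: one KPCA per feature subset. The RBF kernel and the number of
    # components are assumptions, not values taken from the paper.
    train_parts, test_parts = [], []
    for cluster in np.unique(labels):
        cols = np.flatnonzero(labels == cluster)
        kpca = KernelPCA(n_components=n_components, kernel="rbf")
        train_parts.append(kpca.fit_transform(X_train[:, cols]))
        test_parts.append(kpca.transform(X_test[:, cols]))
    return np.hstack(train_parts), np.hstack(test_parts), labels


# Synthetic stand-in for a defect dataset: 12 features built as three
# redundant groups of four noisy copies of a latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = np.hstack([latent[:, [i]] + 0.1 * rng.normal(size=(200, 4)) for i in range(3)])
y = (latent[:, 0] + latent[:, 1] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
Z_tr, Z_te, labels = hybrid_reduce(X_tr, X_te)

# Final step: build a prediction model on the reduced data (classifier assumed).
model = LogisticRegression(max_iter=1000).fit(Z_tr, y_tr)
score = model.score(Z_te, y_te)
print("reduced shape:", Z_tr.shape, "test accuracy:", score)
```

Fitting the scaler, the clustering, and each KPCA on the training split alone keeps information from the test split out of the reduction step, which matters when the reduced features are later used to compare prediction performance.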


Updated: 2023-07-20
