An automated approach for binary classification on imbalanced data
Knowledge and Information Systems (IF 2.7), Pub Date: 2024-01-12, DOI: 10.1007/s10115-023-02046-7
Pedro Marques Vieira , Fátima Rodrigues

Imbalanced data are present in various business sectors and must be handled with the proper resampling methods and classification algorithms. To handle imbalanced data, there are numerous resampling and learning method combinations; nonetheless, their effective use necessitates specialised knowledge. In this paper, several approaches, ranging from more accessible to more advanced in the domain of data resampling techniques, will be considered to handle imbalanced data. The application developed delivers recommendations of the most suitable combinations of techniques for a specific dataset by extracting and comparing dataset meta-feature values recorded in a knowledge base. It facilitates effortless classification and automates part of the machine learning pipeline with comparable or better results than state-of-the-art solutions and with a much smaller execution time.
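The recommendation step described in the abstract (extracting meta-features from a dataset and comparing them with values recorded in a knowledge base) can be illustrated with a minimal sketch. This is not the authors' implementation: the three meta-features, the Euclidean distance, the nearest-entry rule, and every knowledge-base entry below are assumptions chosen purely for illustration.

```python
import math
from collections import Counter


def meta_features(X, y):
    """Compute a few simple meta-features of a binary dataset.

    The choice of features is an assumption for this sketch; the paper's
    actual meta-feature set is not given in the abstract.
    """
    counts = Counter(y)
    majority = max(counts.values())
    minority = min(counts.values())
    return {
        "log_n_instances": math.log10(len(X)),
        "log_n_features": math.log10(len(X[0])),
        "log_imbalance_ratio": math.log10(majority / minority),
    }


# Hypothetical knowledge base: each entry pairs stored meta-feature values
# with the (resampler, classifier) combination assumed to have worked best
# on a dataset with that profile. The entries are made up for illustration.
KNOWLEDGE_BASE = [
    ({"log_n_instances": 3.0, "log_n_features": 1.0, "log_imbalance_ratio": 1.0},
     ("SMOTE", "RandomForest")),
    ({"log_n_instances": 5.0, "log_n_features": 2.0, "log_imbalance_ratio": 2.0},
     ("RandomUnderSampler", "GradientBoosting")),
    ({"log_n_instances": 2.0, "log_n_features": 0.5, "log_imbalance_ratio": 0.3},
     ("no resampling", "LogisticRegression")),
]


def distance(a, b):
    """Euclidean distance between two meta-feature dictionaries."""
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def recommend(X, y, knowledge_base=KNOWLEDGE_BASE):
    """Return the combination stored with the closest knowledge-base entry."""
    features = meta_features(X, y)
    _, combination = min(
        knowledge_base, key=lambda entry: distance(features, entry[0])
    )
    return combination
```

Calling `recommend(X, y)` with a list of feature rows `X` and a list of binary labels `y` returns the stored pair whose meta-feature profile is closest to the new dataset. A real system would use a richer meta-feature set and a knowledge base populated from actual experiments.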




Updated: 2024-01-12