Exploratory machine learning with unknown unknowns
Artificial Intelligence (IF 14.4), Pub Date: 2023-12-19, DOI: 10.1016/j.artint.2023.104059
Peng Zhao, Jia-Wei Shan, Yu-Jie Zhang, Zhi-Hua Zhou

In conventional supervised learning, a training dataset is given with ground-truth labels from a known label set, and the learned model classifies unseen instances into those known labels. This paper studies a new problem setting in which the training data contain unknown classes that are misperceived as other labels, so their existence is not apparent from the given supervision. We attribute these unknown unknowns to the fact that the training dataset is badly advised by an incompletely perceived label space, which results from insufficient feature information. To this end, we propose exploratory machine learning, which examines and investigates the training data by actively augmenting the feature space to discover potentially hidden classes. Our method consists of three ingredients: a rejection model, feature exploration, and a model cascade. We provide theoretical analysis to justify its superiority and validate its effectiveness on both synthetic and real datasets.
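
The abstract names three ingredients without giving details, so the following is only a rough sketch of how a rejection model, feature exploration, and a model cascade might fit together, not the paper's algorithm. Everything in it is an assumption made for illustration: the confidence threshold, the toy models, the `acquire_feature` callback, and the fallback label `"unknown"`.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical type: a model returns (predicted label, confidence in [0, 1]).
Model = Callable[[Sequence[float]], Tuple[str, float]]


def predict_with_reject(model: Model, x: Sequence[float],
                        threshold: float) -> Optional[str]:
    """Rejection step: return the label, or None if confidence is too low."""
    label, confidence = model(x)
    return label if confidence >= threshold else None


def cascade_predict(x: Sequence[float], base_model: Model,
                    augmented_model: Model,
                    acquire_feature: Callable[[Sequence[float]], float],
                    threshold: float) -> str:
    """Two-stage cascade: try the base model, explore a feature on rejection."""
    label = predict_with_reject(base_model, x, threshold)
    if label is not None:
        return label
    # Feature exploration (assumed form): obtain one extra feature value.
    x_aug: List[float] = list(x) + [acquire_feature(x)]
    label = predict_with_reject(augmented_model, x_aug, threshold)
    # Assumed fallback when the augmented model also rejects.
    return label if label is not None else "unknown"


def base_model(x: Sequence[float]) -> Tuple[str, float]:
    """Toy model on one feature; confidence grows with distance from 0.5."""
    label = "A" if x[0] < 0.5 else "B"
    return label, 0.5 + abs(x[0] - 0.5)


def augmented_model(x: Sequence[float]) -> Tuple[str, float]:
    """Toy model that uses the acquired feature x[1] to expose a hidden class."""
    if x[1] > 0.5:
        return "hidden", 0.9
    return base_model(x)[0], 0.85
```

In this toy setup, an instance far from the decision boundary is labeled by the base model alone, while an ambiguous instance is rejected, augmented with the extra feature, and passed to the second model, which may assign it to a class the base model cannot represent.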




Updated: 2023-12-19