Multi-Instance Learning with One Side Label Noise
ACM Transactions on Knowledge Discovery from Data (IF 3.6). Pub Date: 2024-03-26. DOI: 10.1145/3644076
Tianxiang Luan, Shilin Gu, Xijia Tang, Wenzhang Zhuge, Chenping Hou

Multi-instance Learning (MIL) is a popular learning paradigm arising from many real applications. It assigns a label to a set of instances, called a bag, and the bag's label is determined by the instances within it: a bag is positive if and only if it contains at least one positive instance. Since labeling bags is more complicated than labeling individual instances, mislabeling is a common problem in MIL. Furthermore, it is more common for a negative bag to be mislabeled as a positive one, since a single mislabeled instance changes the label of the whole bag. This is an important problem originating from real applications, e.g., web mining and image classification, but as far as we know little research has concentrated on it. In this article, we focus on the MIL problem with one-side label noise, in which negative bags are mislabeled as positive ones. To address this challenging problem, we propose what is, to the best of our knowledge, a novel multi-instance learning method for one-side label noise. We design a new double-weighting approach under the traditional framework to characterize the "faithfulness" of each instance and each bag when learning the classifier. Briefly, at the instance level, we employ a sparse weighting method to select the key instances, converting the MIL problem with one-side label noise into a mislabeled supervised learning scenario. At the bag level, the bag weights, together with the selected key instances, are used to identify the truly positive bags. In addition, we solve the proposed model with an alternating iteration method whose convergence is proven. Empirical studies on various datasets validate the effectiveness of our method.
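The double-weighting idea described above can be illustrated with a minimal, hypothetical sketch. The abstract does not specify the model, so the linear instance-level classifier, the ridge-style reweighted fit, and the sigmoid bag weights below are all assumptions made for illustration; the loop simply alternates between picking a key instance per positive bag, refitting the classifier, and down-weighting positive bags whose best instance still scores low (the suspected mislabeled negatives).

```python
import numpy as np

def bag_label(instance_labels):
    """MIL rule: a bag is positive iff it contains at least one positive instance."""
    return int(any(instance_labels))

def double_weighting_sketch(bags, bag_labels, n_iter=10, lam=0.1):
    """Illustrative sketch only, not the authors' formulation.

    bags: list of (n_i, d) arrays of instances; bag_labels: 0/1 per bag,
    where some positive labels may actually be noisy negatives.
    """
    rng = np.random.default_rng(0)
    d = bags[0].shape[1]
    w = rng.normal(scale=0.1, size=d)   # assumed linear instance-level classifier
    bag_w = np.ones(len(bags))          # bag "faithfulness" weights

    for _ in range(n_iter):
        # Instance step: in each positive bag keep only the highest-scoring
        # instance (a sparse "key instance" choice); negative bags contribute
        # their mean instance.
        X, y, sw = [], [], []
        for b, (Xi, yi) in enumerate(zip(bags, bag_labels)):
            scores = Xi @ w
            if yi == 1:
                X.append(Xi[int(np.argmax(scores))]); y.append(1.0)
            else:
                X.append(Xi.mean(axis=0)); y.append(-1.0)
            sw.append(bag_w[b])
        X, y, sw = np.array(X), np.array(y), np.array(sw)

        # Classifier step: bag-weighted ridge regression on the selected instances.
        W = np.diag(sw)
        w = np.linalg.solve(X.T @ W @ X + lam * np.eye(d), X.T @ W @ y)

        # Bag step: positive bags whose best instance still scores low are
        # suspected mislabeled negatives and receive a smaller weight.
        margins = np.array([np.max(Xi @ w) if yi == 1 else -np.max(Xi @ w)
                            for Xi, yi in zip(bags, bag_labels)])
        bag_w = 1.0 / (1.0 + np.exp(-5.0 * margins))

    return w, bag_w
```

As a toy usage, feeding in a few synthetic bags (e.g., Gaussian instances where "positive" bags contain one shifted instance, plus some negative bags deliberately relabeled as positive) shows the bag weights of the relabeled bags shrinking over the iterations while the weights of genuine positives stay close to one.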




Updated: 2024-03-26