Prediction Consistency Regularization for Learning with Noise Labels Based on Contrastive Clustering
Entropy (IF 2.7), Pub Date: 2024-03-30, DOI: 10.3390/e26040308
Xinkai Sun 1, 2 , Sanguo Zhang 1, 2 , Shuangge Ma 3

In classification tasks, label noise significantly degrades model performance, primarily by disrupting prediction consistency and thereby reducing classification accuracy. This work introduces a novel prediction consistency regularization that mitigates the impact of label noise on neural networks by constraining similar samples to receive consistent predictions. The primary challenge, however, is determining which samples should be treated as similar. We formalize similar-sample identification as a clustering problem and employ twin contrastive clustering (TCC) to address it. To ensure similarity between samples within each cluster, we enhance TCC by adjusting the clustering prior distribution using label information. Based on the adjusted TCC’s clustering results, we first construct a prototype for each cluster and then formulate a prototype-based regularization term that enhances prediction consistency with respect to the prototype within each cluster and counteracts the adverse effects of label noise. We conducted comprehensive experiments on benchmark datasets to evaluate the effectiveness of our method under various scenarios with different noise rates. The results clearly demonstrate an improvement in classification accuracy. Subsequent analytical experiments confirm that the proposed regularization term effectively mitigates noise and that the adjusted TCC improves the quality of similar-sample identification.
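
The abstract does not give the exact form of the prototype-based regularization term, so the following is only a minimal NumPy sketch of the general idea, under stated assumptions. It assumes that cluster assignments are already available (for example from a clustering step such as TCC, which is not implemented here), that a cluster's prototype prediction is the mean of its members' softmax outputs, and that consistency is measured by a KL divergence from each sample's prediction to its cluster prototype. The function names, the KL choice, and the weight `lam` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cluster_prototypes(probs, cluster_ids, n_clusters):
    """Assumed prototype: mean predicted distribution of each cluster's members.

    Empty clusters fall back to a uniform distribution.
    """
    n_classes = probs.shape[1]
    protos = np.full((n_clusters, n_classes), 1.0 / n_classes)
    for k in range(n_clusters):
        members = probs[cluster_ids == k]
        if len(members) > 0:
            protos[k] = members.mean(axis=0)
    return protos


def consistency_penalty(probs, cluster_ids, protos, eps=1e-12):
    """Mean KL(prototype || sample prediction) over all samples (assumed form)."""
    target = protos[cluster_ids]  # prototype distribution for each sample
    kl = np.sum(target * (np.log(target + eps) - np.log(probs + eps)), axis=1)
    return kl.mean()


def regularized_loss(logits, noisy_labels, cluster_ids, n_clusters, lam=1.0):
    """Cross-entropy on the (possibly noisy) labels plus the consistency term."""
    probs = softmax(logits)
    rows = np.arange(len(noisy_labels))
    cross_entropy = -np.log(probs[rows, noisy_labels] + 1e-12).mean()
    protos = cluster_prototypes(probs, cluster_ids, n_clusters)
    return cross_entropy + lam * consistency_penalty(probs, cluster_ids, protos)


if __name__ == "__main__":
    # Toy data: 4 samples, 2 classes, 2 clusters; the second label is "noisy".
    logits = np.array([[2.0, 0.0], [0.0, 2.0], [0.0, 3.0], [0.0, 3.0]])
    cluster_ids = np.array([0, 0, 1, 1])
    noisy_labels = np.array([0, 1, 1, 1])
    print(regularized_loss(logits, noisy_labels, cluster_ids, n_clusters=2))
```

In this sketch the penalty is zero when all members of a cluster receive identical predictions and grows as predictions within a cluster diverge, which is the behavior a prediction consistency constraint on similar samples is meant to encourage. In actual training the term would be computed in an automatic-differentiation framework so that gradients flow through the predictions.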

Updated: 2024-03-31