Unsupervised multimodal learning for image-text relation classification in tweets, Pattern Analysis and Applications

Pattern Analysis and Applications (IF 3.9), Pub Date: 2023-10-10, DOI: 10.1007/s10044-023-01204-5
Lin Sun, Qingyuan Li, Long Liu, Yindu Su

Recent studies show that multimodality can effectively enhance the understanding of social media content. The relations between texts and images have become an important basis for developing multimodal data and models. Some studies have attempted to label image-text relations (ITR) and build supervised learning models. However, manually labeling ITR is challenging and yields many controversial labels because annotators disagree. In this paper, we present a novel unsupervised multimodal method called ITR pseudo-labeling (ITRp) that learns multimodal representations for various ITR types using different finetuning strategies. Our ITRp method generates pseudo-labels by clustering and uses them as supervision to train the classifier and encoders. We evaluate the ITRp method on the ITR dataset and examine how samples with incorrect labels affect both the supervised and unsupervised models. The code and data are available at https://github.com/SuYindu/ITRp.
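The idea of clustering to obtain pseudo-labels and then training a classifier on them can be illustrated with a minimal sketch. This is not the authors' ITRp implementation: the abstract does not name the clustering algorithm, classifier, or encoders, so the k-means step, the perceptron classifier, the toy 2-D feature vectors (standing in for multimodal embeddings), and all function names below are assumptions made purely for illustration. Encoder finetuning is omitted entirely.

```python
# Toy sketch of pseudo-labeling by clustering (hypothetical, not the ITRp code).

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def kmeans_pseudo_labels(points, k, iters=10):
    """Plain k-means; returns one cluster index (pseudo-label) per point.

    Assumption: centroids are initialised from evenly spaced samples so the
    sketch stays deterministic. Real pipelines usually use random restarts.
    """
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point takes the index of its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def train_perceptron(points, labels, k, epochs=20):
    """Multiclass perceptron trained with the pseudo-labels as supervision."""
    dim = len(points[0])
    weights = [[0.0] * (dim + 1) for _ in range(k)]  # last entry is the bias
    for _ in range(epochs):
        for p, y in zip(points, labels):
            x = list(p) + [1.0]
            pred = max(range(k), key=lambda c: dot(weights[c], x))
            if pred != y:
                # Pull the correct class toward x, push the wrong class away.
                weights[y] = [w + xi for w, xi in zip(weights[y], x)]
                weights[pred] = [w - xi for w, xi in zip(weights[pred], x)]
    return weights

def predict(weights, p):
    x = list(p) + [1.0]
    return max(range(len(weights)), key=lambda c: dot(weights[c], x))

# Toy "embeddings": two well-separated groups, with no human labels.
features = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.4, 0.4),
            (5.0, 5.0), (5.5, 5.2), (5.1, 4.6), (4.7, 5.3)]

pseudo_labels = kmeans_pseudo_labels(features, k=2)
weights = train_perceptron(features, pseudo_labels, k=2)
print(pseudo_labels, predict(weights, (0.2, 0.3)), predict(weights, (5.2, 4.9)))
```

In this sketch the cluster indices carry no semantic meaning on their own; mapping clusters to named relation types would need a separate step that the abstract does not describe.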

Updated: 2023-10-11
