Unsupervised Domain Adaptation on Sentence Matching Through Self-Supervision, Journal of Computer Science and Technology



Unsupervised Domain Adaptation on Sentence Matching Through Self-Supervision
Journal of Computer Science and Technology (IF 1.9), Pub Date: 2023-11-30, DOI: 10.1007/s11390-022-1479-0
Gui-Rong Bai, Qing-Bin Liu, Shi-Zhu He, Kang Liu, Jun Zhao

Although neural approaches have yielded state-of-the-art results on the sentence matching task, their performance drops dramatically when they are applied to unseen domains. To tackle this cross-domain challenge, we address unsupervised domain adaptation for sentence matching, where the goal is to perform well on a target domain given only unlabeled target-domain data and labeled source-domain data. Specifically, we propose to achieve this by performing self-supervised tasks. Unlike previous unsupervised domain adaptation methods, self-supervision can be specially designed to suit the characteristics of sentence matching, and it is also much easier to optimize. During training, each self-supervised task is performed on both domains simultaneously in an easy-to-hard curriculum, which gradually brings the two domains closer together along the direction relevant to the task. As a result, the classifier trained on the source domain is able to generalize to the unlabeled target domain. In total, we present three types of self-supervised tasks, and the results demonstrate their superiority. We further study how different ways of using the self-supervised tasks affect performance, which may inform how to use self-supervision effectively in cross-domain scenarios.
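The abstract does not say what the three self-supervised tasks are or how difficulty is measured, so the following is only a minimal sketch of the easy-to-hard scheduling idea, not the authors' implementation. The names `curriculum_stages` and `length_gap` are hypothetical, and the difficulty score (the word-count gap between the two sentences of a pair) is an assumption chosen purely for illustration. Each stage unlocks a larger, harder share of both domains, so the self-supervised task always sees source and target data together.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]  # one sentence pair


def length_gap(pair: Pair) -> float:
    """Hypothetical difficulty score: word-count gap between the two sentences."""
    first, second = pair
    return abs(len(first.split()) - len(second.split()))


def curriculum_stages(
    source: Sequence[Pair],
    target: Sequence[Pair],
    difficulty: Callable[[Pair], float],
    num_stages: int,
) -> Iterator[List[Tuple[str, Pair]]]:
    """Yield one training pool per stage, easiest examples first.

    Every pool mixes both domains, so a self-supervised task trained on it
    is performed on source and target data at the same time.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    src_sorted = sorted(source, key=difficulty)
    tgt_sorted = sorted(target, key=difficulty)
    for stage in range(1, num_stages + 1):
        # Ceiling division: the share of each domain unlocked at this stage.
        n_src = (stage * len(src_sorted) + num_stages - 1) // num_stages
        n_tgt = (stage * len(tgt_sorted) + num_stages - 1) // num_stages
        pool = [("source", pair) for pair in src_sorted[:n_src]]
        pool += [("target", pair) for pair in tgt_sorted[:n_tgt]]
        yield pool


if __name__ == "__main__":
    source = [("a b c", "a b c d e f"), ("a b", "a b")]
    target = [("x", "x y z"), ("x y", "x y")]
    for index, pool in enumerate(curriculum_stages(source, target, length_gap, 2), 1):
        print(f"stage {index}: {len(pool)} pairs")
```

In a full training loop, each pool would feed a self-supervised loss alongside the supervised matching loss computed on labeled source pairs; how the two losses are combined is not specified in the abstract and is left out here.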


Updated: 2023-11-30
