Multimodal Interactive Network for Sequential Recommendation, Journal of Computer Science and Technology


Multimodal Interactive Network for Sequential Recommendation
Journal of Computer Science and Technology (IF 1.9), Pub Date: 2023-07-31, DOI: 10.1007/s11390-022-1152-7
Teng-Yue Han, Peng-Fei Wang, Shao-Zhang Niu

Building an effective sequential recommendation system remains challenging because interactions between users and items are limited. Recent work has shown that incorporating textual or visual information into sequential recommendation alleviates the data sparsity problem, which is attracting considerable attention in both industry and academia. However, modeling interactions among modalities in a sequential scenario is an interesting yet challenging task because of multimodal heterogeneity. In this paper, we introduce a novel recommendation approach that considers both textual and visual information, namely the Multimodal Interactive Network (MIN). The advantage of MIN lies in a learning framework that leverages the interactions among modalities at both the item level and the sequence level to build an efficient system. First, an item-wise interactive layer based on the encoder-decoder mechanism models the item-level interactions among modalities to select informative content. Second, a sequence interactive layer based on the attention strategy captures the sequence-level preference of each modality. MIN seamlessly incorporates interactions among modalities at both the item level and the sequence level for sequential recommendation. This is the first time that interactions in each modality have been explicitly discussed and utilized in sequential recommenders. Experimental results on four real-world datasets show that our approach significantly outperforms all the baselines on the sequential recommendation task.
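
The abstract gives no equations, so the following is only a toy sketch of the two-level idea it describes, not the authors' model. It assumes each item comes with a precomputed text vector and image vector of equal dimension, and it substitutes plain scaled dot-product attention for both the encoder-decoder item-wise layer and the sequence interactive layer. All function names are hypothetical, and there are no learned parameters.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def item_level_interaction(text_vec, image_vec):
    # Assumed stand-in for the item-wise interactive layer: each modality
    # of an item attends over both modalities of that same item.
    both = [text_vec, image_vec]
    return attend(text_vec, both, both), attend(image_vec, both, both)

def sequence_level_preference(item_vecs):
    # Assumed stand-in for the sequence interactive layer: the most recent
    # item is the query over the whole history of one modality.
    return attend(item_vecs[-1], item_vecs, item_vecs)

def user_representation(text_seq, image_seq):
    refined = [item_level_interaction(t, v) for t, v in zip(text_seq, image_seq)]
    text_pref = sequence_level_preference([r[0] for r in refined])
    image_pref = sequence_level_preference([r[1] for r in refined])
    return text_pref + image_pref  # concatenate the per-modality preferences

def score(user_vec, cand_text, cand_image):
    # Dot-product relevance of a candidate item to the user representation.
    return dot(user_vec, cand_text + cand_image)

# Made-up two-dimensional features for a three-item history.
text_seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_seq = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
user = user_representation(text_seq, image_seq)
candidate_score = score(user, [1.0, 0.0], [0.0, 1.0])
```

Ranking candidate items by `score` would give a next-item recommendation in this toy setting. A real implementation would add learned projections and training, which the abstract does not specify.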


Updated: 2023-07-31
