Hierarchical bottleneck for heterogeneous graph representation
Information Sciences (IF 8.1), Pub Date: 2024-03-07, DOI: 10.1016/j.ins.2024.120422
Yunfei He, Li Meng, Jian Ma, Yiwen Zhang, Qun Wu, Weiping Ding, Fei Yang

Heterogeneous graphs (HGs) contain many nodes and the interaction relationships among them; they can model complex systems and provide rich semantic and structural information for downstream tasks. HG representation is the fundamental component of such modeling. Existing HG representation methods primarily use graph neural networks to capture the semantics of nodes along various meta-paths and then fuse them to represent the nodes. The most prevalent methods comprise two steps: semantic information extraction within meta-paths and semantic fusion between meta-paths. However, these methods overlook both node heterogeneity within meta-paths and semantic correlation between meta-paths. Specifically, node heterogeneity within meta-paths means that meta-path-based neighbors do not always carry information that benefits the target node, and semantic correlation between meta-paths means that different meta-path spaces are not entirely independent. Disregarding either issue propagates irrelevant or redundant information and can degrade the HG embedding. In this study, we therefore propose HBHG, a hierarchical bottleneck for heterogeneous graph representation. HBHG uses the information bottleneck (IB) as its guiding principle, constraining the propagation of irrelevant information within and between meta-paths while preserving relevant information. The IB views model learning as preserving relevant information and compressing irrelevant information: it minimizes the dependency between the input and the hidden features, measured by mutual information (MI), while maximizing the dependency between the hidden features and the ground truth.
Because MI is difficult to estimate, this paper instead adopts the Hilbert-Schmidt independence criterion (HSIC), a dependency measure that is easy to compute. HBHG comprises two primary components: a semantic bottleneck within meta-paths and a semantic bottleneck between meta-paths. The semantic bottleneck within meta-paths applies HSIC-based dependency constraints at different layers of the graph neural network on each meta-path, thereby maximizing the extraction of information relevant to the target node from neighboring nodes. The semantic bottleneck between meta-paths enables flexible extraction and fusion of semantic information according to the downstream task by managing the HSIC-measured dependency trade-off between different meta-path semantic spaces. In summary, HBHG integrates hierarchical bottleneck constraints within and between meta-paths, which maximizes the aggregation of relevant information while compressing irrelevant information and thereby improves the quality of heterogeneous graph embeddings. The effectiveness of HBHG was validated through performance and ablation experiments on multiple datasets.
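
To make the HSIC idea in the abstract concrete, the following is a minimal sketch of the standard biased empirical HSIC estimator with Gaussian kernels, plus a bottleneck-style loss built from it. It is not the authors' implementation: the abstract does not give HBHG's loss, kernels, or hyperparameters, so the function names (gaussian_kernel, hsic, hsic_bottleneck_loss), the kernel choice, the bandwidth sigma, and the trade-off weight beta are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(x: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Gaussian (RBF) kernel matrix over the rows of x: (n, d) -> (n, n)."""
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2.

    K and L are kernel matrices of x and y; H centers them.
    Larger values indicate stronger statistical dependence.
    """
    n = x.shape[0]
    k = gaussian_kernel(x, sigma)
    l = gaussian_kernel(y, sigma)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(k @ h @ l @ h)) / (n - 1) ** 2


def hsic_bottleneck_loss(inputs: np.ndarray, hidden: np.ndarray,
                         targets: np.ndarray, beta: float = 1.0,
                         sigma: float = 1.0) -> float:
    """Assumed bottleneck-style objective (not taken from the paper):
    penalize dependence of hidden features on the inputs, reward
    dependence on the targets, weighted by the hypothetical beta."""
    return hsic(inputs, hidden, sigma) - beta * hsic(targets, hidden, sigma)


# Usage: a noisy copy of x should score higher than an unrelated sample.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
dependent = x + 0.1 * rng.normal(size=(200, 2))
independent = rng.normal(size=(200, 2))
hsic_dependent = hsic(x, dependent)
hsic_independent = hsic(x, independent)
```

In this sketch, hsic_dependent comes out larger than hsic_independent, which is the property a bottleneck constraint relies on: the measure grows with statistical dependence and stays near zero for unrelated variables. In a real model the same quantity would be computed on mini-batches inside an automatic-differentiation framework so that it can be minimized or maximized by gradient descent.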

Updated: 2024-03-07