Enhancing Out-of-distribution Generalization on Graphs via Causal Attention Learning
ACM Transactions on Knowledge Discovery from Data (IF 3.6), Pub Date: 2024-03-26, DOI: 10.1145/3644392
Yongduo Sui 1, Wenyu Mao 1, Shuyao Wang 1, Xiang Wang 2, Jiancan Wu 1, Xiangnan He 2, Tat-Seng Chua 3
In graph classification, attention- and pooling-based graph neural networks (GNNs) predominate: they extract salient features from the input graph to support the prediction. They mostly follow the paradigm of "learning to attend," which maximizes the mutual information between the attended graph and the ground-truth label. However, this paradigm causes GNN classifiers to indiscriminately absorb all statistical correlations between input features and labels in the training data, without distinguishing the causal from the noncausal effects of features. Rather than emphasizing causal features, the attended graphs tend to rely on noncausal features as shortcuts to predictions. These shortcut features can easily change outside the training distribution, leading to poor generalization of GNN classifiers. In this article, we take a causal view of GNN modeling. Under our causal assumption, the shortcut feature acts as a confounder between the causal feature and the prediction. It misleads the classifier into learning spurious correlations that facilitate prediction under in-distribution (ID) test evaluation but cause a significant performance drop on out-of-distribution (OOD) test data. To address this issue, we employ the backdoor adjustment from causal theory: combining each causal feature with various shortcut features to identify causal patterns and mitigate the confounding effect. Specifically, we employ attention modules to estimate the causal and shortcut features of the input graph. A memory bank then collects the estimated shortcut features, enhancing the diversity of shortcut features available for combination. Simultaneously, we apply a prototype strategy to improve the consistency of intra-class causal features. We term our method CAL+; it promotes a stable relationship between causal estimation and prediction regardless of distribution changes. Extensive experiments on synthetic and real-world OOD benchmarks demonstrate our method's effectiveness in improving OOD generalization. Our code is released at https://github.com/shuyao-wang/CAL-plus.
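To make the mechanism concrete, below is a minimal PyTorch sketch of the two ingredients the abstract describes: an attention module that splits a graph's node embeddings into a causal readout and a shortcut readout, and a backdoor-adjustment loss that pairs each causal feature with shortcut features drawn from a memory bank, plus a simple prototype loss. All names (CausalAttention, backdoor_adjust_loss, the additive combination, the MSE prototype loss) are illustrative assumptions, not the authors' implementation; the released code at the repository above is authoritative.

```python
# Illustrative sketch only (assumed names and simplifications),
# not the authors' released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalAttention(nn.Module):
    """Soft node attention that disentangles a graph into a causal
    readout and a shortcut readout."""
    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Linear(dim, 2)  # per-node logits: [causal, shortcut]

    def forward(self, h: torch.Tensor):
        # h: (num_nodes, dim) node embeddings from any GNN encoder
        att = F.softmax(self.scorer(h), dim=-1)   # (num_nodes, 2)
        a_c, a_s = att[:, 0:1], att[:, 1:2]       # complementary soft masks
        h_causal = (a_c * h).sum(dim=0)           # attended causal feature
        h_shortcut = (a_s * h).sum(dim=0)         # attended shortcut feature
        return h_causal, h_shortcut

def backdoor_adjust_loss(h_causal, shortcut_bank, label, classifier, k=4):
    """Backdoor adjustment: combine one causal feature with k shortcut
    features sampled from the memory bank and average the classification
    loss, so no single shortcut feature can carry the prediction."""
    idx = torch.randint(len(shortcut_bank), (k,))
    losses = []
    for i in idx:
        mixed = h_causal + shortcut_bank[i]       # additive combination (assumed)
        logits = classifier(mixed.unsqueeze(0))   # (1, num_classes)
        losses.append(F.cross_entropy(logits, label.unsqueeze(0)))
    return torch.stack(losses).mean()

def prototype_loss(h_causal, prototypes, label):
    """Prototype strategy: pull each causal feature toward its class
    prototype to keep intra-class causal features consistent."""
    return F.mse_loss(h_causal, prototypes[label])
```

In training, the memory bank would be a first-in-first-out queue of recent h_shortcut vectors refreshed each batch, and the total objective would sum the backdoor-adjusted classification loss and the prototype loss.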




Updated: 2024-03-26