A novel graph oversampling framework for node classification in class-imbalanced graphs
Science China Information Sciences (IF 8.8), Pub Date: 2024-04-15, DOI: 10.1007/s11432-023-3897-2
Riting Xia, Chunxu Zhang, Yan Zhang, Xueyan Liu, Bo Yang

Graph neural networks (GNNs) are a promising method for analyzing graphs. Most existing GNNs assume balanced classes and therefore handle class-imbalanced graphs poorly. Oversampling is an effective way to alleviate class imbalance. However, most graph oversampling methods generate synthetic minority nodes and their edges after applying a GNN. They overlook a problem: because the GNN aggregates neighbor information before oversampling, the representations of both the original and the synthetic minority nodes are dominated by majority nodes. In this paper, we propose a novel graph oversampling framework, distribution alignment-based oversampling for node classification in class-imbalanced graphs (Graph-DAO). Our framework generates synthetic minority nodes before the GNN, which avoids the dominance of majority nodes caused by message passing. Additionally, we introduce a distribution alignment method based on the sum-product network to learn more information about minority nodes. To the best of our knowledge, this is the first work to use a sum-product network to address class imbalance in node classification. Extensive experiments on four real-world datasets show that our method achieves the best results on node classification for class-imbalanced graphs.
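
The idea of oversampling before the GNN can be illustrated with a minimal sketch. This is not the Graph-DAO algorithm: the abstract does not specify how synthetic nodes or their edges are generated, so the sketch assumes a generic SMOTE-style interpolation between raw feature vectors of minority nodes, and it omits both edge generation and the sum-product network alignment. The function name `oversample_minority` and the toy data are hypothetical.

```python
import random


def oversample_minority(features, labels, minority_label, n_new, seed=0):
    """SMOTE-style interpolation on raw node features, i.e. before any
    message passing has mixed majority-node information into them.

    Assumption: a generic interpolation scheme, not the method of the paper.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority nodes to interpolate")

    new_features, new_labels = [], []
    for _ in range(n_new):
        a = rng.choice(minority)
        # Nearest other minority node in raw feature space
        # (squared Euclidean distance).
        b = min(
            (j for j in minority if j != a),
            key=lambda j: sum((x - y) ** 2 for x, y in zip(features[a], features[j])),
        )
        t = rng.random()  # interpolation weight in [0, 1)
        new_features.append([x + t * (y - x) for x, y in zip(features[a], features[b])])
        new_labels.append(minority_label)

    # Return new lists; the inputs are left unmodified.
    return features + new_features, labels + new_labels


# Toy data: four majority nodes (label 0), two minority nodes (label 1).
features = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [1.0, 1.0], [1.2, 0.9]]
labels = [0, 0, 0, 0, 1, 1]
aug_features, aug_labels = oversample_minority(features, labels, minority_label=1, n_new=2)
```

Because the synthetic features are built only from minority nodes' raw features, they carry no majority-node information at this stage; a GNN would then be applied to the augmented node set.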




Updated: 2024-04-19