Abstract
Graph contrastive learning (GCL) has been widely studied in unsupervised graph representation learning. Most existing GCL methods focus on modeling the invariances of identical instances across different augmented views of a graph and use a graph neural network (GNN) as the underlying encoder to generate node representations. GNNs generally learn node representations by aggregating information from each node's neighbors, so homophily and heterophily in the graph can strongly affect their performance. Existing GCL methods neglect the effect of homophily and heterophily, which leads to sub-optimal representations on graphs with more complex patterns, especially graphs with high heterophily. We propose a novel Similarity-based Graph Contrastive Learning model (SimGCL), which generates augmented views with a higher homophily ratio at the topology level by adding or removing edges. We treat dimension-wise features as weak labels and introduce a new similarity metric, based on feature and feature dimension-wise distribution patterns, as a guide for improving homophily in an unsupervised manner. To preserve node diversity in augmented views, we retain feature dimensions with higher heterophily, which amplifies the differences between nodes at the feature level. We also use the proposed similarity in the negative sampling process to eliminate possible false negative samples. We conduct extensive experiments comparing our model with ten baseline methods on seven benchmark datasets. Experimental results show that SimGCL significantly outperforms state-of-the-art GCL methods on both homophilic and heterophilic graphs and brings more than 10% improvement on heterophilic graphs.
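The topology-level idea in the abstract, raising the homophily ratio of a view by adding and removing edges according to a similarity score, can be illustrated with a minimal sketch. This is not the SimGCL implementation: plain cosine similarity stands in for the paper's similarity metric, the function names (edge_homophily, cosine_similarity, rewire_by_similarity) and the toy graph are hypothetical, and node labels are used only to measure homophily before and after, never to choose edges.

```python
import numpy as np


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a label."""
    edges = np.asarray(edges)
    return float(np.mean(labels[edges[:, 0]] == labels[edges[:, 1]]))


def cosine_similarity(features):
    """Pairwise cosine similarity between the rows of a feature matrix."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def rewire_by_similarity(edges, features, k):
    """Drop the k least similar edges and add the k most similar non-edges.

    Assumption: cosine similarity is a stand-in for the paper's metric.
    """
    n = features.shape[0]
    sim = cosine_similarity(features)
    edge_set = {(min(u, v), max(u, v)) for u, v in edges}

    # Keep all but the k existing edges with the lowest endpoint similarity.
    ranked = sorted(sorted(edge_set), key=lambda e: sim[e], reverse=True)
    kept = ranked[: max(len(ranked) - k, 0)]

    # Add the k currently unconnected pairs with the highest similarity.
    candidates = [
        (u, v) for u in range(n) for v in range(u + 1, n)
        if (u, v) not in edge_set
    ]
    added = sorted(candidates, key=lambda e: sim[e], reverse=True)[:k]
    return sorted(kept + added)


# Hypothetical toy graph: two classes with integer features, mostly
# cross-class (heterophilous) edges.
features = np.array([
    [3, 1, 0, 0],
    [2, 1, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
edges = [(0, 3), (1, 4), (2, 5), (0, 1)]

augmented = rewire_by_similarity(edges, features, k=2)
print("homophily before:", edge_homophily(edges, labels))
print("homophily after: ", edge_homophily(augmented, labels))
```

On this toy graph the rewired view contains more same-class edges than the original, even though the rewiring step never sees the labels. The sketch enumerates all node pairs, so it is only practical for small graphs.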
Notes
We use node labels to calculate the local assortativity, following Eq. 1.
All datasets used in this paper have integer-valued features.
Author information
Contributions
CL and CY wrote the main manuscript text and prepared figures. NG, SD, and ZY revised and polished the article. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Liu, C., Yu, C., Gui, N. et al. SimGCL: graph contrastive learning by finding homophily in heterophily. Knowl Inf Syst 66, 2089–2114 (2024). https://doi.org/10.1007/s10115-023-02022-1
DOI: https://doi.org/10.1007/s10115-023-02022-1