SimGCL: graph contrastive learning by finding homophily in heterophily

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Graph contrastive learning (GCL) has been widely studied in unsupervised graph representation learning. Most existing GCL methods focus on modeling the invariances of identical instances across different augmented views of a graph and use a Graph Neural Network (GNN) as the underlying encoder to generate node representations. GNNs generally learn node representations by aggregating information from their neighbors, so homophily and heterophily in the graph can strongly affect their performance. Existing GCL methods neglect the effect of homophily/heterophily, resulting in sub-optimal representations for graphs with more complex patterns, especially under high heterophily. We propose a novel Similarity-based Graph Contrastive Learning model (SimGCL), which generates augmented views with a higher homophily ratio at the topology level by adding or removing edges. We treat dimension-wise features as weak labels and introduce a new similarity metric, based on feature and feature dimension-wise distribution patterns, as a guide to improving homophily in an unsupervised manner. To preserve node diversity in augmented views, we retain feature dimensions with higher heterophily to amplify the differences between nodes at the feature level. We also use the proposed similarity in the negative sampling process to eliminate possible false negative samples. We conduct extensive experiments comparing our model with ten baseline methods on seven benchmark datasets. Experimental results show that SimGCL significantly outperforms state-of-the-art GCL methods on both homophilic and heterophilic graphs and brings more than 10% improvement on heterophilic graphs.
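To make the idea of raising the homophily ratio by adding or removing edges concrete, here is a minimal Python sketch on a toy graph. It is not the SimGCL implementation: plain cosine similarity between node features stands in for the paper's similarity metric, which the abstract does not define in detail, and the function names (`edge_homophily`, `cosine_similarity_matrix`, `rewire_by_similarity`), the toy data, and the drop/add budgets are all hypothetical. Labels are used only to measure homophily before and after; the rewiring step itself sees features only.

```python
import numpy as np


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share the same label."""
    edges = np.asarray(edges)
    if len(edges) == 0:
        return 0.0
    return float(np.mean(labels[edges[:, 0]] == labels[edges[:, 1]]))


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def rewire_by_similarity(edges, sim, n_drop, n_add):
    """Drop the n_drop least similar edges, add the n_add most similar non-edges.

    Assumption: an undirected graph without self-loops; `sim` is any
    symmetric node-by-node similarity matrix (cosine similarity here,
    as a stand-in for the paper's metric).
    """
    edge_set = {tuple(sorted(e)) for e in edges}
    ranked = sorted(edge_set, key=lambda e: sim[e[0], e[1]], reverse=True)
    kept = ranked[: max(len(ranked) - n_drop, 0)]
    n = sim.shape[0]
    candidates = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if (i, j) not in edge_set
    ]
    candidates.sort(key=lambda e: sim[e[0], e[1]], reverse=True)
    return kept + candidates[:n_add]


# Toy heterophilic graph: two classes of three nodes, mostly cross-class edges.
labels = np.array([0, 0, 0, 1, 1, 1])
features = np.array(
    [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 1.0], [0.0, 0.9]]
)
edges = [(0, 3), (1, 4), (2, 5), (0, 1)]

sim = cosine_similarity_matrix(features)
augmented = rewire_by_similarity(edges, sim, n_drop=2, n_add=2)

h_before = edge_homophily(edges, labels)
h_after = edge_homophily(augmented, labels)
print("homophily before:", h_before)
print("homophily after:", h_after)
```

On this toy input the rewired view removes the two least similar cross-class edges and adds two within-class edges, so the measured homophily ratio increases. A real pipeline would need a scalable candidate search rather than the all-pairs loop used here.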

Notes

  1. We use node labels to calculate the local assortativity, following Eq. 1.

  2. All datasets used in this paper have integer-valued features.

Author information

Contributions

CL and CY wrote the main manuscript text and prepared figures. NG, SD, and ZY revised and polished the article. All authors reviewed the manuscript.

Corresponding author

Correspondence to Ning Gui.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, C., Yu, C., Gui, N. et al. SimGCL: graph contrastive learning by finding homophily in heterophily. Knowl Inf Syst 66, 2089–2114 (2024). https://doi.org/10.1007/s10115-023-02022-1

  • DOI: https://doi.org/10.1007/s10115-023-02022-1
