
Contrastive Graph Similarity Networks

Published: 08 January 2024


Abstract

Graph similarity learning is a significant and fundamental issue in the theory and analysis of graphs, and it has been applied in a variety of fields, including object tracking, recommender systems, and similarity search. Recent deep-learning-based methods for graph similarity learning typically share two deficiencies: (1) they leverage graph neural networks as backbones for learning graph representations but do not capture the complex information inside the data well, and (2) they employ a cross-graph attention mechanism for graph similarity learning, which is computationally expensive. Taking these limitations into consideration, a method for graph similarity learning is devised in this study, namely, the Contrastive Graph Similarity Network (CGSim). To enhance graph similarity learning, CGSim makes use of the complementary information of two input graphs and captures their pairwise relations in a contrastive learning framework. By developing a dual contrastive learning module with a node-graph matching and a graph-graph matching mechanism, our method reduces the quadratic time complexity of cross-graph interaction modeling to linear time complexity. Jointly learned in an end-to-end framework, the graph representation embedding module and the well-designed contrastive learning module benefit one another. A comprehensive series of experiments indicates that CGSim outperforms state-of-the-art baselines on six datasets while significantly reducing the computational cost.


1 INTRODUCTION

Learning a function to calculate how similar two graphs are to each other is the main objective of graph similarity learning (GSL). It is a prominent graph theory issue with numerous practical applications, including computer vision [35], programming code analysis [47], financial transaction analysis [22], protein-protein interaction alignment [5], and entity linking in knowledge graphs [46]. Numerous graph-matching techniques have been proposed to evaluate how similar two graphs are. Some traditional methods for graph similarity computation, such as heuristic search methods and exact search strategies, focus only on the graph topology. For instance, Gori et al. [10] introduce a graph matching study that uses random walks as its backbone to find exact graph matchings. By separating colliding graph signatures, Gori et al. [11] provide a search algorithm to locate full mappings between the nodes to be matched. These techniques are designed to identify the complex node-to-node (N2N) correspondence between two given graphs.

However, exact graph matching, i.e., finding exact node correspondences between graphs, commonly referred to as the graph isomorphism problem or the subgraph isomorphism problem (identifying whether a graph matches a part of another graph), is computationally expensive and even NP-complete [13]. To address this challenge, relaxed graph matching, which computes similarities approximately, has drawn much attention. Hlaoui et al. [16] propose a search method that decomposes the matching process into several stages to find the smallest-error mapping for the relaxed graph matching problem. This algorithm significantly reduces the search space and produces good approximate matches between graphs. Zhou et al. [58] formalize pairwise graph matching as a quadratic assignment problem (QAP) and design a fast approximation algorithm by decomposing the affinity matrix, a matrix that structures the similarity, into Kronecker products of special small matrices. Lagrangian relaxation graph matching (LRGM), a novel and cost-effective graph matching relaxation technique, is proposed by Jiang et al. [17]. LRGM offers a framework for maximizing the relaxed matching objective by including affine mapping constraints in the matching target. The aforementioned works focus on finding node-to-node matches on the graph topology. However, some inherent property features (e.g., node/edge weights) contained in graphs are overlooked, which limits the accuracy of graph similarity computation.

Graph neural networks (GNNs) have recently emerged as viable methods for GSL. GNN-based methods [41] typically employ a GNN encoder that maps each node of a graph into a low-dimensional latent space by capturing the topological structure and node content (if it exists). Therefore, GNN-based graph similarity learning methods can evaluate graph similarity more comprehensively than methods that only consider graph topology [2]. Graph-level representations serve as the foundation for measuring the similarity score between two given graphs. There are many ways to obtain a graph-level representation, and most works use a pooling layer to obtain graph-level embeddings. The pooling layer obtains a graph-level representation, a single vector of fixed dimension, by aggregating all node embeddings in the graph. The simplest pooling layer is mean pooling, which produces the graph-level representation by averaging all node representations in the pooled graph. In addition to mean pooling, max pooling and LSTM pooling are popular pooling methods for obtaining graph-level representations. However, learning the representation of each single graph is not sufficient to compare the differences between two graphs well. For this reason, some works refine the difference between two graphs by comparing the nodes of the two input graphs one by one. Specifically, to capture the correspondence between nodes across graphs, these methods employ a cross-graph attention mechanism for graph representation learning.
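For illustration, below is a minimal sketch of the mean and max pooling readouts mentioned above, assuming the node embeddings of one graph are stacked in a tensor `H` of shape `[num_nodes, d]` (the tensor name and shapes are our own, not from the paper):

```python
import torch

def mean_pool(H: torch.Tensor) -> torch.Tensor:
    # H: [num_nodes, d] node embeddings of one graph.
    # Returns a fixed-dimension graph-level vector of shape [d].
    return H.mean(dim=0)

def max_pool(H: torch.Tensor) -> torch.Tensor:
    # Element-wise maximum over all node embeddings.
    return H.max(dim=0).values
```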

While achieving promising results, these methods typically suffer from two deficiencies. Limitation 1: the graph representation learning (GRL) module in these methods does not capture the complex information needed for graph representation learning well. For example, vanilla GNN encoders, such as the gated graph neural network (GGNN) [27], the graph convolutional network (GCN) [24], and the graph attention network (GAT) [37], are designed for general graph representation learning and overlook pairwise relations between graphs. Moreover, the aforementioned works generally take a single graph as input, whereas GSL focuses on comparing a pair of graphs. Thus, instead of learning a pair of graphs separately, it is preferable to learn their representations simultaneously. Limitation 2: computational expense. Learning the matching relationship between nodes is key to improving similarity learning. To determine how a node corresponds to nodes in another graph, existing methods attend each node in one graph over all nodes in the other graph (as shown in Figure 1(a)), resulting in a time complexity of \(O(|V_1||V_2|)\) for a single epoch of the cross-graph attention mechanism, where \(|V_1|\) and \(|V_2|\) are the node numbers of the two matched graphs [26]. It is challenging to apply this type of cross-graph matching mechanism to real-world issues, since it is computationally expensive.
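To make the quadratic cost concrete, the following sketch shows a plain dot-product cross-graph attention step of the kind the limitation refers to; the tensor names `H1`/`H2` and the dot-product scoring are illustrative assumptions, not the exact mechanism of any cited method:

```python
import torch

def cross_graph_attention(H1: torch.Tensor, H2: torch.Tensor) -> torch.Tensor:
    # H1: [|V1|, d], H2: [|V2|, d] node embeddings of the two graphs.
    # Every node in G1 attends over every node in G2, so the score matrix
    # alone has |V1| * |V2| entries -- the quadratic bottleneck.
    scores = H1 @ H2.T                      # [|V1|, |V2|]
    attn = torch.softmax(scores, dim=-1)    # attention of each G1 node over G2
    return attn @ H2                        # G2-aware representation of G1 nodes
```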

Fig. 1.

Fig. 1. Comparison of different cross-graph interaction mechanisms. (a) Existing methods employ a 1-versus-all attention for every node to capture the correspondence between this node and all other nodes in another graph, resulting in a complexity of \(O(|V_1||V_2|)\) . (b) Our method employs a 1-versus-1 mechanism, where each node will be paired with the whole graph for contrastive learning, resulting in a complexity of \(O(|V_1|+|V_2|)\) . \(|V_1|\) and \(|V_2|\) are node numbers in two matched graphs.

To address the aforementioned limitations, we propose a novel method, the Contrastive Graph Similarity network (CGSim), which makes use of contrastive learning to both enhance graph representation learning and reduce the computational complexity of pairwise matching. Contrastive learning, as an unsupervised learning framework, has achieved promising performance in diverse tasks such as natural language processing (NLP) [9], graph analysis for representation learning on sparse graphs [21], and computer vision (CV) [54]. The primary objective of contrastive learning is to create a pair of semantically similar instances from the data itself, so that the mutual information between the data and the constructed similar instances can be maximized. Using contrastive learning, we can easily extract the agreement between two instances, that is, the similarity between them. Contrastive learning has shown its effectiveness in exploiting the complex information in (graph) data; however, all existing studies, such as References [14, 59], focus on building pairwise contrastive samples within a single graph. Cross-graph contrastiveness, which is the natural case for GSL, has not been exploited. Specifically, contrastiveness in existing studies is mostly built between a graph and its augmented graph, while in our study we build contrastiveness between two different graphs, which further takes the similarity between nodes into account for GSL.

As part of this work, we develop a new method for learning similarity between graphs based on cross-graph contrastive learning. To address Limitation 1 of existing approaches, our method directly captures the pairwise relations between graphs by exploiting the complementary information from the different graphs. In particular, we compute the similarity of pairwise nodes from the two different graphs using contrastive learning. To reduce the computational cost of matching, i.e., Limitation 2, we design our method as a dual contrastive learning framework, i.e., employing node-graph matching and graph-graph matching simultaneously. For node-graph matching, given two graphs \(G_1\) and \(G_2\), our method CGSim creates a pair \((v, G_2)\) for every node v in \(G_1\) for contrastive learning. As CGSim replaces the computationally expensive 1-versus-all attention mechanism with a 1-versus-1 contrastive learning module, it significantly reduces the time complexity of cross-graph interaction modeling from \(O(|V_1||V_2|)\) to \(O(|V_1|+|V_2|)\), as shown in Figure 1(b). To measure the similarity of two graphs globally, we provide graph-graph matching, which uses graph-level representations to compute similarity. More specifically, our method maximizes the mutual information of the two representations from two similar graphs, which further enhances the representation learning for both graphs. Graph representation learning and contrastive learning are trained in a unified framework so that GSL and contrastive learning can benefit and enhance one another. In particular, the similarities between two graphs are extracted while the differences between them are enlarged. Using six real-world datasets for evaluation, we show that CGSim achieves consistent performance gains compared with baselines. We also provide ablation experiments demonstrating the validity of our proposed cross-graph contrastiveness. We provide runtime tests to demonstrate that our model alleviates the limitation introduced in Section 1 and has a lower runtime, which makes the GSL algorithm more applicable to the real world. We summarize the contributions of this article as follows:

  • To obtain a finely vectorized representation of nodes, we integrate contrastive learning into the representation learning process of GSL. CGSim employs a cross-graph learning scheme to contrastively enhance representation learning.

  • To obtain an explicit matching relationship, we propose a dual contrastive matching framework that employs node-graph matching and graph-graph matching simultaneously for GSL. CGSim significantly reduces the time complexity of cross-graph interaction modeling. In this way, our proposed model differs significantly from existing approaches.

  • We evaluate our algorithm with extensive experiments and compare it with various baselines. The results demonstrate the superiority of CGSim.


2 RELATED WORK

2.1 Pairwise Graph Similarity Computation

In graph theory, pairwise graph similarity computation is a fundamental problem, which is employed to determine how similar a pair of graphs is. It benefits many real-world application tasks such as object tracking, recommender systems, binary code analysis, and similarity search. Early works establish handcrafted similarity measures. For example, Zager et al. [52] employ a linear update to obtain node similarity scores and edge similarity scores. Papadimitriou et al. [33] propose five similarity evaluation methods for web graph anomaly detection. These strategies have been widely adopted in follow-up works [25]. However, the aforementioned techniques mainly rely on the original graph structures and are learning-free, making it challenging to effectively utilize the features carried by nodes and edges. Benefiting from advances in deep learning, graph similarity computation has developed dramatically [40, 53]. Some GSL works originate from computer vision: they use encoders to extract attributed graphs from input images, perform GSL on them, and design losses to optimize the computer vision encoders. For example, Yu et al. [50] convert the Hungarian algorithm into a hard attention mechanism and incorporate a multi-head attention mechanism for optimizing deep encoders in deep graph matching. The entire end-to-end pipeline modeled by Wang et al. [39] is differentiable, parameterizing the affinity functions within and across graphs to learn graph similarity. With the advancement of deep encoders in the CV field, deep graph kernels have become the focus of some recent studies. Through learning a designed kernel function, deep graph kernels measure similarity scores in the substructure space of any two matched graphs [30].

Kernel-based techniques use kernel functions that correspond to inner products in a Reproducing Kernel Hilbert Space (RKHS) to analyze the similarity of two objects. The challenge for kernel-based techniques is to develop an appropriate kernel function that captures the structural information while remaining efficient to compute. Yanardag et al. [48] propose DGK, a unified framework for learning latent representations of graph substructures. The framework defines graph-to-graph similarity by using the dependency information between substructures to define latent representations of graph structures. The Weisfeiler-Lehman (WL) subtree kernel is a commonly used graph kernel basis. Apart from that, the authors give special cases of two other common kernel frameworks: graphlet kernels and shortest-path-based graph kernels. An unsupervised graph representation learning approach from Al-Rfou et al. [1] attempts to build a learnable encoder to capture the graph structure. A cross-graph attention network is trained to record the interaction between the two embeddings of each graph pair. They use the predictions of the attention-enhanced encoder to define divergence scores for each pair of graphs. Finally, they use these pairwise divergence scores to construct an embedding space for all graphs, reducing the reliance on domain-specific knowledge (such as the WL kernel).

Different from deep graph kernel methods, GNN-based methods are usually built on graph representation learning. GNN-based methods typically use a GNN as the base encoder and then perform a cross-graph attention mechanism to learn the extent to which nodes in the two matched graphs correspond to each other [19]. Concretely, SimGNN [2] combines two strategies. First, to offer global information about a graph, SimGNN constructs an embedding function that transforms the input graph pairs into an embedding space. Second, SimGNN creates a pairwise node-level comparison framework to supplement graph embeddings with fine-grained node information. Finally, the GNN encoder is optimized through a loss carefully designed for the graph similarity problem. GMN [26] proposes a GSL framework that receives graph pairs as inputs. Using a matching mechanism based on a cross-graph attention strategy, the GMN framework calculates the pairwise similarity score and performs joint inference on the pair. Afterward, GraphSim [3] observes that graph neural networks are limited by the fixed dimension of graph-level representations, which may not fully capture graphs of different sizes and structures. Therefore, GraphSim designs a multi-scale node representation for similarity computation to find fine-grained differences between two graphs. GraphSim abandons fixed-dimensional vectors and directly matches the two sets of node embeddings that represent the entire graphs. MGMN [28] finds that recent GSL works mainly consider graph-to-graph (G2G) or N2N interactions while ignoring cross-level interactions (such as node-to-graph (N2G) interactions). MGMN suggests a way to effectively learn cross-level interactions between nodes in the two different graphs while computing graph similarity end-to-end. Unfortunately, the aforementioned models share a common limitation: the cross-graph attention mechanism is computationally expensive.

2.2 Graph Representation Learning (GRL)

Recently, GRL has focused on learning graph representations by utilizing both the topology and the attribute information of graphs [18, 51]. It has been applied in many real-world scenarios, such as recommender systems (RSs) and social networks (SNs) [20]. In early works, DeepWalk [34] learns social network embeddings by vectorizing the graph through a random walk algorithm with few annotated nodes. Its key idea is inspired by the Skip-gram method of word2vec in NLP [32]. To handle large-scale graph data, LINE [36] maps all of the network's nodes into an embedding space and attempts to preserve the network's original structure in terms of first-order and second-order proximity. The representations of the network nodes are then modeled by processing the resulting vectors. GNNs have garnered a fair amount of attention as a result of the advancement of deep graph learning [43, 44, 55]. For instance, GGNN is a typical spatial-domain message-passing method built on the Gated Recurrent Unit (GRU). The general message-passing methodology consists of three operations: message passing, update, and readout. A node's embedding at the next step is determined by its current embedding, the current embeddings of its neighbors, and the information of the edges connecting the interacting nodes. GCN [24] aggregates the features of a central node and its neighbors through a weighted average function to create a new node representation in the spectral domain. Subsequently, many variants of GCN have been studied, for example, GAT [37] and the graph isomorphism network (GIN) [45]. GAT provides a self-attention mechanism for aggregating neighbor information, which adaptively assigns weights to different neighbors and improves accuracy. GAT can give various weights to different nodes in the neighborhood without any expensive matrix operations (e.g., inversion) or prior knowledge of the graph structure. GIN analyzes GNNs through the lens of graph isomorphism theory and proves that the WL test is an upper bound on GNN expressive power. Graph isomorphism is also an important indicator for GSL, and the Weisfeiler-Lehman test is a powerful algorithm for distinguishing graph structures and deciding whether two graphs are isomorphic. GIN demonstrates that it has expressive/discriminative ability equivalent to the WL test.

2.3 Contrastive Learning

Contrastive learning is an emerging self-supervised learning technique that originates from computer vision. It compares minute variations across samples in the embedding space to learn discriminative sample representations. Recent years have seen many fascinating works. For example, MoCo [15] casts contrastive learning as a dictionary lookup, constructing dynamic dictionaries with a queue and a moving-average encoder. This allows building large and consistent dictionaries that facilitate contrastive unsupervised learning. SimCLR [6] simplifies recently proposed contrastive self-supervised learning algorithms without the need for specialized architectures or memory banks. SimCLR demonstrates that effective prediction tasks depend heavily on data augmentation and introduces a learnable nonlinear transformation between the representations and the contrastive loss, which greatly improves the quality of the learned representations. This holds not only for color distortion but also for other types of data transformations. In general, contrastive training is more sensitive to systematic bias in the data. Data bias is a widespread problem in machine learning, and it has a greater impact on contrastive methods (e.g., color distortion). Grill et al. [12] propose BYOL, a contrastive learning method that does not rely on negative sampling and thus avoids the data bias problem. BYOL does not care whether different samples have different representations (i.e., the "contrast" part of contrastive learning), but simply makes similar samples have similar representations. It may seem inconsequential, but such a setup significantly improves training efficiency and generalization. In training, each sample is only sampled once per traversal, and there is no need to focus on negative samples. Since no negative sampling is required, BYOL has higher training efficiency. The BYOL model is insensitive to systematic bias in the training data, which means that BYOL also generalizes better to unseen samples. Barlow Twins [54] dispenses with many complex tricks, such as negative samples, momentum-updated networks, predictors, and stop-gradient operations. Barlow Twins tries to learn the representation from a different perspective, starting from the embedding itself rather than from the samples. The optimization goal is to make the cross-correlation matrix of the features from different views close to the identity matrix, i.e., making the features in different dimensions carry different information as much as possible, which improves the representational power of the features. Given the excellent performance of contrastive learning, some works have introduced contrastive learning into supervised models and obtained excellent results. For example, to use label information efficiently, Khosla et al. [23] adapt self-supervised contrastive learning for the fully supervised setting. In particular, samples with the same label are pulled together while clusters of samples from distinct categories are pushed apart in the embedding space.

In addition to computer vision, contrastive learning is an emerging approach for GRL and self-supervised learning [56, 57]. In contrastive learning, the principal objective is to maximize the mutual information between two semantically similar instances constructed from the data itself. Contrastive learning has reached state-of-the-art (SOTA) performance [6, 29, 42, 56] and has been applied to graph data. For instance, DGI [38] maximizes the mutual information between local and global information to enhance representations. Building on the DGI idea, MVGRL [14] discriminates between local and global representations through contrastive multi-view representation learning on graphs. GraphCL [49] proposes a graph-level contrastive learning method with four types of graph augmentation. However, all contrastive-learning-based methods are designed for the representation of a single graph. This scheme is not suitable for GSL, which normally takes two graphs as input to compare their differences. To fill this gap, we propose CGSim, which builds contrastiveness between different graphs.


3 METHOD

Beginning with a definition of the graph similarity problem, this section introduces our proposed method CGSim, followed by the elaboration of each critical component. We list the key terms and notations in Table 1.

Table 1.
Notations | Descriptions
G | A graph
\(V, E, X\) | Node sets, edge sets, and feature sets
\(h_v\) | Embedding of a node v
\(h_G\) | Embedding of a graph G
l | lth layer of the neural network
\(\tau\) | Temperature parameter
\(|V|\) | Cardinality of a node set V
\(N(v)\) | Neighbors of a node v
\(AGG(\cdot)\) | Aggregate function
\(Trans(\cdot)\) | Transform function
\(AVG(\cdot)\) | Average function
\(\theta (\cdot)\) | Cosine similarity function
\(\mathbb {E}(\cdot)\) | Exponential function
\(BCE(\cdot)\) | Binary cross entropy loss

Table 1. Summary of Notations

3.1 Problem Definition

Let \(G =(V, E, X)\) be an undirected graph with a node set V and an edge set E, where \(E \subseteq V \times V\). \(|\cdot |\) denotes the cardinality of a set; specifically, \(|V|\) represents the number of nodes. \(X \in \mathbb {R}^{|V| \times d}\) denotes the node features, where d is the feature dimension. The similarity score between two input graphs is defined as follows:

Definition 3.1

(Graph Similarity Score).

Given two graphs \(G_1 = (V_1,E_1)\) and \(G_2 = (V_2,E_2)\) as inputs, their embeddings \(h_{G_1}\) and \(h_{G_2}\) are used to calculate the cosine similarity score \(\theta (h_{G_1},h_{G_2})\) between the two graphs: (1) \(\begin{equation} \theta (h_{G_1},h_{G_2}) = \frac{h_{G_1} \cdot h_{G_2}}{\parallel h_{G_1}\parallel \cdot \parallel h_{G_2}\parallel }. \end{equation}\)

Our objective is to train a learnable GNN encoder that takes a pair of graphs as input and outputs their low-dimensional representations for similarity computation. The trained encoder can then be applied to any unseen graph pair for similarity calculation during the inference phase. The calculated cosine similarity score of the two graphs is used in subsequent downstream tasks related to graph similarity.
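As a minimal sketch, the cosine similarity of Equation (1) between two graph-level embeddings can be computed as follows (using PyTorch is our assumption; the paper does not prescribe a framework):

```python
import torch
import torch.nn.functional as F

def graph_similarity(h_g1: torch.Tensor, h_g2: torch.Tensor) -> torch.Tensor:
    # Equation (1): cosine similarity between two graph-level embeddings
    # of shape [d]; the result lies in [-1, 1].
    return F.cosine_similarity(h_g1, h_g2, dim=-1)
```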

3.2 Solution Overview

Based on the two limitations we mentioned in Section 1, we design CGSim upon the following intuitions:

Intuition 1. Responsibility of Representations. Graph similarity computation is conducted on the graph embeddings of input pairs. To create representations of unified dimension that can be used to quantify similarity, graph pairs first need to be mapped to the same low-dimensional space. Thus, learning representations that well preserve the complex information of the original graph is essential to the success of this measurement. Graph-level representations are pooled from node-level representations. Therefore, using a GNN to learn expressive node-level representations that fully capture the properties of nodes is one of the key designs in our similarity learning.

Intuition 2. Capability of Cross-graph Interaction. One important way to improve the computation of graph similarity scores is to establish the correspondence between the nodes of two graphs. To enable effective graph matching, a cross-graph interaction module is required to exploit the pairwise relations for graph representation learning. As far as we know, current works normally employ attention mechanisms for cross-graph interactions between nodes. Specifically, each node in one graph needs to perform attention calculations with all nodes in the other graph. The time complexity of these cross-graph attention mechanisms is \(O(|V_1||V_2|)\), which is not conducive to application on real-world datasets. This is because learning the 1-versus-all correspondence (i.e., attending a target node over all nodes in another graph to discover the correspondence) is computationally expensive. The ideal cross-graph interaction module should have a low computational cost as well as a high representation capability. Contrastive learning, which naturally takes two graphs as input, is a promising solution for cross-graph interaction.

Overall Framework. Following these intuitions, we combine graph representation learning with contrastive learning in a unified framework to learn graph similarity. The overview of our proposed model CGSim, which consists of four components, (a) input graphs, (b) a GNN module, (c) a cross-graph contrastive learning module, and (d) a combinatorial loss module, is shown in Figure 2. To train our model, we first input the graph pairs into a GNN module, which outputs node-level embeddings. By aggregating node embeddings, a graph pooling technique generates graph-level embeddings. Then, we construct a dual contrastive learning framework with a node-graph matching and a graph-graph matching scheme utilizing the generated embeddings at different levels. In addition, we refine the generated embeddings by further exploiting the supervision signals with the BCE loss. We jointly train a combinatorial loss that consists of the aforementioned parts (i.e., node-graph contrastiveness, graph-graph contrastiveness, and cross-entropy loss), so that each component can benefit the others. We discuss these components of CGSim in more detail in the following sections.

Fig. 2.

Fig. 2. Overview of CGSim. The input graph pairs in module (a) are sent to the GNN module (b) to capture node-level representations with a shared encoder. The node-level representations serve as the input of module (c). In module (c), graph pooling is used to obtain graph-level representations. Then, both node-level and graph-level representations are sent to a carefully designed cross-graph contrastive learning module to achieve cross-graph interaction via node-graph matching and graph-graph matching mechanisms. To refine the generated representations, supervision signals are further exploited with a cross-entropy loss: the BCE loss is calculated by penalizing the discrepancies between the predicted and ground-truth graph similarity scores. The whole model is trained using the combinatorial loss module (d), which combines the node-graph matching loss, graph-graph matching loss, and cross-entropy loss.

3.3 Representation Learning

As shown in Figures 2(a) and 2(b), the first two components of our proposed method (input graphs and GNN module) are illustrated in this section. A siamese network was originally designed for distinguishing visual differences in computer vision by measuring the similarity between two input images [4]. Two inputs are supplied to the two neural networks of a siamese neural network. The inputs are mapped to a new space by the two networks, which share the same weights. The similarity of the two inputs is measured by the loss calculation. Generally, a siamese network consists of two neural networks with shared parameters, which leads to more robust semantic similarity learning. The architecture of the siamese network naturally fits GSL, as it introduces inductive biases for identifying invariant patterns of similar objects [7, 8, 31]. In our method, we adopt a siamese network architecture as our backbone. Specifically, the GNN module is made up of two GNN encoders with shared learnable weights. To estimate the matching score of two graphs, we first need to embed the two graphs into a vector space. Here, we employ a GNN encoder to acquire node representations by collecting each node's local structure information. The GNN encoder may be considered a feature extractor for graphs, which mainly consists of message aggregation and transformation. Specifically, the GNN updates node representations iteratively by aggregating their neighboring representations. The whole process is defined as (2) \(\begin{equation} \begin{aligned}h_v^{(l)} &= \texttt {Agg}^{(l)}\left(h_u^{(l-1)} :u\in N(v)\right),\\ h_v^{(l)} &=\texttt {Trans}^{(l)}\left(h_v^{(l-1)}, h_v^{(l)}\right), \end{aligned} \end{equation}\) where \(h_v^{(l)}\) is the representation vector of a node v at the lth layer and \(N(v)\) represents the set of nodes adjacent to v [45]. Agg(\(\cdot\)) is an aggregate function that aggregates the neighbors' information into the node itself, and Trans(\(\cdot\)) is the transform function that transforms the message to produce an updated embedding. Then, the graph-level representation \(h_G\) can be obtained via mean-pooling, denoted as Avg(\(\cdot\)) below: (3) \(\begin{equation} h_G = \texttt {Avg}(h_v: v\in V). \end{equation}\) The graph representation learning module is the basic framework for the subsequent steps. After obtaining graph-level and node-level representations, we perform the cross-graph interaction computation described in the following.
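Below is a minimal sketch of one message-passing layer in the spirit of Equation (2) and the mean-pooling readout of Equation (3); the dense adjacency matrix, sum aggregation, and concatenate-then-linear transform are illustrative choices, not necessarily the exact Agg/Trans functions used by CGSim:

```python
import torch
import torch.nn as nn

class SimpleGNNLayer(nn.Module):
    """One message-passing layer following Equation (2): aggregate neighbor
    embeddings, then transform them together with the node's previous
    embedding. A sketch that assumes a dense adjacency matrix A ([N, N])
    and sum aggregation."""
    def __init__(self, dim: int):
        super().__init__()
        self.trans = nn.Linear(2 * dim, dim)

    def forward(self, H: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        msg = A @ H                                    # Agg: sum over neighbors
        return torch.relu(self.trans(torch.cat([H, msg], dim=-1)))  # Trans

def readout(H: torch.Tensor) -> torch.Tensor:
    # Equation (3): mean pooling over node embeddings -> graph embedding h_G
    return H.mean(dim=0)
```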

3.4 Cross-graph Contrastive Matching

One of the primary limitations of the cross-graph matching mechanism is its high computational complexity \(O(|V_1||V_2|)\), which limits the scalability of GSL. To confront this challenge, we introduce a low-complexity dual cross-graph contrastive learning module, which allows our model to learn cross-level interactions (i.e., node-graph matching and graph-graph matching) both effectively and efficiently. Graph-graph contrastiveness learns coarse-grained matching between graphs, whereas node-graph contrastiveness learns fine-grained matching between nodes and graphs. Our method reduces the complexity to \(O(|V_1|+|V_2|)\) while maintaining competitive performance.

Contrastive Node-Graph Matching. To reduce the excessive computational cost, we construct a contrastive node-graph matching mechanism for cross-graph information distillation instead of pairwise associating node representations in one graph with node representations in its counterpart. Aggregating the node embeddings in a graph yields a graph-level embedding, which is an average representation of the node embeddings and has the same dimension as the node embeddings; therefore, we use the graph embedding of one graph to take the place of the node embeddings of the other graph. To further explore the matching relations between node representations and graph representations, we construct a cross-graph contrastive learning method based on the following intuition: if two graphs are similar, all node embeddings in \(G_1\) should be close to the graph-level embedding of \(G_2\); conversely, if two graphs are different, then the node embeddings in \(G_1\) and the graph embedding of \(G_2\) should be pushed away from each other. To achieve this, we exploit mutual information to investigate the correlation between the node embeddings \(h_v\) and the graph embedding \(h_G\) [6, 59]. For an input graph pair \((G_1,G_2)\), we define the node-graph mutual information MI\(_{NG}\) between two graphs as (4) \(\begin{equation} \begin{aligned}\texttt {MI}_{NG}(h_{v}, h_{G_2}) := \sum \limits _{v \in V_1}\mathbb {E}(\theta \left(h_{v}, h_{G_2} \right) / \tau), \end{aligned} \end{equation}\) where \(\mathbb {E}(\cdot)\) is an exponential function and \(\tau\) is a temperature parameter that regulates the model's sensitivity to negative samples. The temperature parameter's role is to modify the level of focus on challenging samples: as the temperature decreases, a sample is distinguished from its most comparable samples with greater care. To guide message interaction, as shown in Figure 3, we use supervision signals to distinguish whether the input pairs are positive (similar) or negative (dissimilar). In the training dataset, each pair of graphs carries a ground-truth label, which indicates whether the pair is similar or dissimilar. According to these ground-truth labels, we regard similar graph pairs as positive samples and dissimilar pairs as negative samples. A batch of size K contains positive graph pairs and negative graph pairs. We regard \((h_{G_1},h_{G_2})\) as a graph pair, \((h_{G_1},h_{G_2})_p\) as a positive pair, and \((h_{G_1},h_{G_2})_k\) as a graph pair in the batch. For positive pairs, we maximize the mutual information between the node embeddings \(h_v\) in \(G_1\) and the graph embedding \(h_{G_2}\). For negative pairs, we minimize the mutual information between \(h_v\) in \(G_1\) and \(h_{G_2}\) in the vector space. Therefore, we define the node-graph loss as (5) \(\begin{equation} \ell _{NG}(h_{G_1},h_{G_2}) = -\log \frac{\sum \nolimits _{p=1}^P \texttt {MI}_{NG}(h_{v},h_{G_2})_p}{\sum \nolimits _{k=1}^K \texttt {MI}_{NG}(h_{v},h_{G_2})_k}, v \in G_1, \end{equation}\) where P is the number of positive pairs in a batch of size K. Through the negatives, the node-graph loss maximizes the mutual information of positive sample pairs and minimizes the mutual information of negative sample pairs. Note that the operations on a graph pair should be symmetric for balance.
We define the final loss \(\mathcal {L}_{NG}\) as the average of the two symmetric directions to reduce inaccuracy: (6) \(\begin{equation} \mathcal {L}_{NG}(h_{G_1}, h_{G_2})=\frac{1}{2} \left[ \ell _{NG}(h_{G_1},h_{G_2})+\ell _{NG}(h_{G_2},h_{G_1})\right]\!. \end{equation}\)
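The following sketch shows one possible reading of Equations (4)-(5); the batching scheme and variable names are our own assumptions and may differ from the paper's implementation:

```python
import torch
import torch.nn.functional as F

def mi_ng(H_nodes: torch.Tensor, h_graph: torch.Tensor, tau: float = 0.7) -> torch.Tensor:
    # Equation (4): sum over nodes v in G1 of exp(cos(h_v, h_G2) / tau).
    sims = F.cosine_similarity(H_nodes, h_graph.unsqueeze(0), dim=-1)  # [|V1|]
    return torch.exp(sims / tau).sum()

def node_graph_loss(node_embs, graph_embs, labels, tau: float = 0.7):
    # node_embs[k]: node embeddings of G1 in pair k ([|V1|, d]);
    # graph_embs[k]: graph embedding of G2 in pair k ([d]);
    # labels[k] = 1 for similar (positive) pairs, -1 otherwise.
    # Equation (5): -log( sum over positive pairs / sum over all pairs ).
    mi = torch.stack([mi_ng(H, h, tau) for H, h in zip(node_embs, graph_embs)])
    return -torch.log(mi[labels == 1].sum() / mi.sum())
```

The symmetric term of Equation (6) would simply call `node_graph_loss` again with the roles of the two graphs swapped and average the two values.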

Fig. 3.

Fig. 3. A brief description of cross-graph contrastive learning module. The dotted line represents the cross-graph interaction.

Contrastive Graph-Graph Matching. While node-level embeddings learn fine-grained cross-graph interactions, the global information of graphs is also valuable and cannot be ignored. Thus, we further enrich the contrastiveness with the proposed graph-graph matching (i.e., contrasting graph-level embeddings) to consolidate the learned graph representations. Intuitively, if two graphs are similar, they should keep high agreement, that is, their mutual information should be maximized, and vice versa. Within one batch, for positive samples we enlarge the agreement between the two graphs, and for negative samples we enlarge their differences macroscopically. We define the graph-graph mutual information MI\(_{GG}\) between two graphs as (7) \(\begin{equation} \begin{aligned}\texttt {MI}_{GG}(h_{G_1}, h_{G_2}) := \mathbb {E}(\theta \left(h_{G_1}, h_{G_2} \right) / \tau). \end{aligned} \end{equation}\) Thus, the aforementioned process can be formulated with the following loss function: (8) \(\begin{equation} \ell _{GG}(h_{G_1}, h_{G_2}) = -\log \frac{\sum \nolimits _{p=1}^P \texttt {MI}_{GG}(h_{G_1}, h_{G_2})_p}{\sum \nolimits _{k=1}^K \texttt {MI}_{GG}(h_{G_1}, h_{G_2})_k}. \end{equation}\) Because the operations on \(G_1\) and \(G_2\) are interchangeable, we take the average loss to reduce the error, and the final objective is defined as (9) \(\begin{equation} \mathcal {L}_{GG} (h_{G_1}, h_{G_2})= \frac{1}{2} \left[\ell _{GG}(h_{G_1}, h_{G_2}) + \ell _{GG}(h_{G_2}, h_{G_1})\right]\!. \end{equation}\) To take full advantage of the supervision signals, we formulate the optimization objective by relating the ground truth to the calculated similarity score. Specifically, we employ a binary cross-entropy loss (BCE loss) to minimize the distance between the calculated similarity score \(\theta (h_{G_1}, h_{G_2})\) and the given similarity indicator y: (10) \(\begin{equation} \mathcal {L}_{BCE}(h_{G_1}, h_{G_2},y) = \sum _{k = 1}^{K}\texttt {BCE}(\theta (h_{G_1},h_{G_2})_k, y). \end{equation}\)
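A sketch of the graph-graph terms in Equations (7), (8), and (10), assuming batched graph embeddings of shape `[K, d]` and labels in {-1, 1}; rescaling the cosine score to [0, 1] for the BCE term is our own assumption about how the score is fed to the loss:

```python
import torch
import torch.nn.functional as F

def mi_gg(h_g1_batch, h_g2_batch, tau: float = 0.7):
    # Equation (7): exp(cos(h_G1, h_G2) / tau) for each graph pair in the batch.
    return torch.exp(F.cosine_similarity(h_g1_batch, h_g2_batch, dim=-1) / tau)

def graph_graph_loss(h_g1_batch, h_g2_batch, labels, tau: float = 0.7):
    # Equation (8): -log( sum over positive pairs / sum over all pairs in batch ).
    mi = mi_gg(h_g1_batch, h_g2_batch, tau)            # [K]
    return -torch.log(mi[labels == 1].sum() / mi.sum())

def bce_loss(h_g1_batch, h_g2_batch, labels):
    # Equation (10): BCE between the predicted similarity and the ground truth.
    # The {-1, 1} labels and the [-1, 1] cosine scores are mapped to [0, 1].
    score = F.cosine_similarity(h_g1_batch, h_g2_batch, dim=-1)
    prob = (score + 1) / 2
    target = (labels.float() + 1) / 2
    return F.binary_cross_entropy(prob, target)
```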

The BCE loss provides a directed optimization goal that guarantees the quality of the representations. Note that the graph-graph matching loss differs from the BCE loss: the BCE loss takes the similarity score as its optimization target for graph representation learning, whereas the graph-graph loss is a subsidiary part of the cross-graph matching module that learns complementary information from the different graphs.

3.5 Objective Function

We design a combinatorial loss composed of (i) the node-graph contrastiveness loss (Equation (6)), (ii) the graph-graph contrastiveness loss (Equation (9)), and (iii) the BCE loss for similarity score adjustment (Equation (10)) to train the whole model in an end-to-end manner. The goal is to minimize the combinatorial loss \(\mathcal {L}\), calculated as follows: (11) \(\begin{equation} \mathcal {L} = \alpha \mathcal {L}_{NG}+ \beta \mathcal {L}_{GG} +\lambda \mathcal {L}_{BCE}, \end{equation}\) where \(\alpha\), \(\beta\), and \(\lambda\) are tunable weight parameters that regulate the impact of the three losses (\(\mathcal {L}_{NG}\), \(\mathcal {L}_{GG}\), and \(\mathcal {L}_{BCE}\)). We set the constraint \(\alpha +\beta +\lambda = 1\) to ensure these parameters are in the range \([0, 1]\) and sum to 1, which reflects how much each loss function contributes to the overall loss. We employ gradient descent to minimize \(\mathcal {L}\) so that the three kinds of losses are trained simultaneously. Therefore, the graph representation learning part and the cross-graph contrastive learning part are not isolated but benefit from each other: the BCE loss optimizes the inputs of cross-graph contrastive learning, and in turn, cross-graph contrastive learning promotes the expressiveness of the node and graph representations. Algorithm 1 gives a brief description of CGSim.
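A hypothetical training step minimizing Equation (11), assuming a `model` that returns the three loss terms for a batch of graph pairs (the function and argument names are illustrative, not taken from the paper):

```python
def training_step(model, optimizer, batch, alpha=0.4, beta=0.4, lam=0.2):
    # One end-to-end update of Equation (11); the weights satisfy
    # alpha + beta + lam = 1 as stated in the constraint above.
    l_ng, l_gg, l_bce = model(batch)
    loss = alpha * l_ng + beta * l_gg + lam * l_bce
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```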

Model advantages. We present the advantages of our proposed method CGSim over recent studies from two perspectives: (1) responsibility of representations and (2) capability of cross-graph matching. First, CGSim integrates graph representation learning and cross-graph contrastive learning into a unified framework, which facilitates message propagation within and across graphs. Second, CGSim reduces the time complexity of cross-graph interaction modeling from \(O(|V_1||V_2|)\) (1-versus-all) to \(O(|V_1|+|V_2|)\) (1-versus-1). This is a significant improvement over existing methods such as GMN [26]. The experiments below show the superior performance of our proposed framework.


4 EXPERIMENTS

In this section, we illustrate how effective and efficient CGSim is through extensive experiments. We answer the following critical research questions:

Q1: How effective is CGSim compared with the SOTA GNN-based graph similarity models?

Q2: How do the proposed node-graph loss, graph-graph loss, and BCE loss help with the CGSim model?

Q3: What is the efficiency of the cross-graph contrastive learning module compared with that of existing cross-graph attention mechanisms?

Datasets Description. We conduct GSL experiments on binary code similarity detection datasets. As mentioned in the Introduction, binary code similarity detection is an essential application of GSL. The goal of binary code similarity detection is to check whether the control flow graphs (CFGs) of two given binary functions are similar. Each pair of graphs carries a label y indicating whether the pair is similar or dissimilar. For a given CFG pair \((G_1,G_2)\), the similarity label \(y = 1\) indicates that \(G_1\) and \(G_2\) are similar; otherwise, \(y = -1\) indicates that the two CFGs are dissimilar [47]. We use the similarity score calculated by CGSim to predict whether the two graphs are similar.
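For illustration, a trivial way to turn the predicted similarity score into a binary decision that matches the dataset's label convention; the threshold value is a hypothetical choice, since the evaluation below uses AUC on the raw scores:

```python
def predict_similar(score: float, threshold: float = 0.0) -> int:
    # Map a cosine similarity score in [-1, 1] to the label convention
    # used by the datasets (1 = similar, -1 = dissimilar).
    return 1 if score > threshold else -1
```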

We apply CGSim to the OpenSSL and FFmpeg datasets [28]. Both the OpenSSL and FFmpeg datasets have three subsets named by their graph-size range. For example, in OpenSSL [3, 200], "3" denotes the minimum CFG size and "200" denotes the maximum CFG size. Table 2 summarizes the statistics of the datasets, where "Graphs" denotes the number of graphs in the dataset, "AvgN"/"AvgE" denote the average number of nodes/edges, and "Classes" refers to the number of classes. It should be emphasized that general graph classification datasets cannot be used for the graph similarity learning task, because the graph classification task only provides a label for each graph, whereas binary code similarity detection assigns a binary similarity label to a pair of graphs rather than two separate labels.

Table 2.
Datasets | Graphs | AvgN | AvgE | Classes
OpenSSL [3, 200] | 73,953 | 15.73 | 21.97 | 4,249
OpenSSL [20, 200] | 15,800 | 44.89 | 67.15 | 1,073
OpenSSL [50, 200] | 4,308 | 83.68 | 127.75 | 338
FFmpeg [3, 200] | 83,008 | 18.83 | 27.02 | 10,376
FFmpeg [20, 200] | 31,696 | 51.02 | 75.88 | 7,668
FFmpeg [50, 200] | 10,824 | 90.93 | 136.83 | 3,178

Table 2. Datasets Description

Baselines. To verify the two intuitions mentioned in Section 3.2, (1) responsibility of representation and (2) capability of cross-graph interaction, we evaluate CGSim from two perspectives: graph representation learning and GSL. We compare CGSim with SOTA graph representation learning methods, GGNN [27], GCN [24], and GIN [45], and with graph similarity learning methods, including SimGNN [2], GMN [26], GraphSim [3], and MGMN [28], as our baselines. MGMN has three different versions, MGMN (Max + BiLSTM), MGMN (FCMax + BiLSTM), and MGMN (BiLSTM + BiLSTM), where the contents in parentheses are combinations of different pooling operations. The details of the baseline algorithms and settings are as follows.

  • GGNN. The gated graph neural network is a classic GRU-based spatial-domain message-passing model. We set the number of GGNN layers = 3, the hidden feature size = 100, learning rate = 0.0005, batch size = 10.

  • GCN. Graph convolutional network captures nodes’ local structure in the spectral domain to learn high-level node representations. We employ a three-layer GCN, and set the hidden feature size = 100, learning rate = 0.0005.

  • GIN. The graph isomorphism network distinguishes different graph structures and maps similar structures to similar embeddings, capturing their dependencies. We employ a three-layer GIN with a hidden feature size = 100, learning rate = 0.0005.

  • SimGNN. SimGNN designs a pairwise node embedding comparison matrix and uses node-level embeddings to supplement the graph-level representation via an attention mechanism. In the experiment, we employ a three-layer GCN as the encoder, the hidden feature size = 64, learning rate = 0.001.

  • GMN. GMN designs a cross-graph attention mechanism to achieve cross-graph interaction. In the experiment, GMN applies a one-layer multi-layer perceptron (MLP) as the encoder. The dimension of the node embedding is 32, learning rate = 0.001.

  • GraphSim. GraphSim compares graph similarity at multiple scales without compressing each graph into a fixed-dimensional vector. The learning rate = 0.001.

  • MGMN. MGMN designs a multi-level graph matching network. MGMN provides three different versions with different aggregators, such as Max, FCMax, and BiLSTM. To accommodate our experiments, we set the number of GNN encoder layers = 3, the hidden feature size = 100, learning rate = 0.0005.

Evaluation Metrics and Parameter Setting. We split each dataset into training, validation, and testing sets with an \(8:1:1\) ratio, following the same data splits as MGMN. We employ a three-layer GGNN [27] as our graph encoder, with the latent dimension set to 100 and the batch size set to 16. We tune the temperature parameter \(\tau\) between 0 and 1 and the batch size within \(\lbrace 2,4,8,16,32\rbrace\). The area under the ROC curve (AUC) is used as the performance metric, and bold values indicate the best performance among approaches. We report the average AUC scores and standard deviations over five runs of each experiment.
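A small sketch of the AUC evaluation, assuming the predicted cosine similarities and the {-1, 1} ground-truth labels have been collected as NumPy arrays:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_auc(scores: np.ndarray, labels: np.ndarray) -> float:
    # scores: predicted cosine similarities; labels: 1 / -1 ground truth.
    # roc_auc_score expects binary targets, so map {-1, 1} -> {0, 1}.
    return roc_auc_score((labels == 1).astype(int), scores)
```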

4.1 Q1: Effectiveness of CGSim

To answer Q1, we evaluate CGSim against the aforementioned baselines on various datasets. The experimental results for OpenSSL and FFmpeg are shown in Tables 3 and 4. Column "[3, 200]" reports the AUC score on dataset [3, 200], "\(-\)" denotes no gain, and "\(\pm\)" indicates a numerical range. Column "Outperform (OP)" shows the improvement of CGSim over the other algorithms. The compared results are quoted from MGMN [28]. CGSim performs the best among all baselines in terms of AUC score. We find that the graph similarity methods SimGNN, GMN, GraphSim, and MGMN (Max + BiLSTM) perform comparatively poorly. CGSim outperforms all baseline methods, which can be attributed to the dual cross-graph contrastive mechanism. We develop a node-graph contrastive mechanism and a graph-graph contrastive mechanism for cross-graph interaction, which not only promote cross-graph message propagation but also enrich graph representation learning during optimization. Although MGMN also considers a multi-level attention mechanism for cross-graph interaction, it heavily relies on combinations of different pooling operations (i.e., BiLSTM with learnable parameters). With a naive pooling mechanism, mean pooling, CGSim achieves similar or better results than MGMN, which demonstrates the effectiveness of our proposed dual cross-graph contrastive learning. It is worth mentioning that CGSim significantly improves graph similarity learning on OpenSSL [50, 200] and FFmpeg [50, 200], which indicates that our approach works well on datasets with large graph sizes.

Table 3.
Model | [3, 200] | OP | [20, 200] | OP | [50, 200] | OP
GGNN | 87.81 \(\pm\) 0.99 | +11.1% | 81.14 \(\pm\) 3.64 | +20.4% | 85.26 \(\pm\) 4.30 | +14.6%
GCN | 89.61 \(\pm\) 1.04 | +8.1% | 75.93 \(\pm\) 0.90 | +28.6% | 73.71 \(\pm\) 1.03 | +32.5%
GIN | 91.45 \(\pm\) 0.16 | +6.6% | 81.18 \(\pm\) 2.10 | +20.3% | 68.22 \(\pm\) 4.36 | +30.1%
SimGNN | 95.96 \(\pm\) 0.31 | +1.6% | 93.58 \(\pm\) 0.82 | +4.4% | 94.25 \(\pm\) 0.85 | +3.7%
GMN | 96.43 \(\pm\) 0.61 | +1.1% | 93.03 \(\pm\) 3.81 | +5.0% | 93.91 \(\pm\) 1.65 | +4.0%
GraphSim | 96.84 \(\pm\) 0.54 | +0.7% | 94.97 \(\pm\) 0.98 | +2.9% | 93.66 \(\pm\) 1.84 | +4.3%
MGMN (Max + BiLSTM) | 94.77 \(\pm\) 1.80 | +2.9% | 97.44 \(\pm\) 0.26 | +0.2% | 94.06 \(\pm\) 1.60 | +3.9%
MGMN (FCMax + BiLSTM) | 96.87 \(\pm\) 0.24 | +0.6% | 97.59 \(\pm\) 0.24 | +0.1% | 95.58 \(\pm\) 1.13 | +2.2%
MGMN (BiLSTM + BiLSTM) | 96.90 \(\pm\) 0.10 | +0.6% | 97.31 \(\pm\) 1.07 | +0.4% | 95.87 \(\pm\) 0.88 | +1.9%
CGSim (Ours) | 97.51 \(\pm\) 0.44 | \(-\) | 97.68 \(\pm\) 1.32 | \(-\) | 97.70 \(\pm\) 1.28 | \(-\)

Table 3. Experimental Results Are Presented as AUC Scores on the OpenSSL Datasets, Which Are Expressed as a Percentage (%)

Table 4.
Model | [3, 200] | OP | [20, 200] | OP | [50, 200] | OP
GGNN | 91.22 \(\pm\) 0.01 | +7.7% | 81.14 \(\pm\) 3.64 | +17.5% | 87.29 \(\pm\) 1.11 | +12.6%
GCN | 94.65 \(\pm\) 0.01 | +3.8% | 92.73 \(\pm\) 0.85 | +6.1% | 89.06 \(\pm\) 2.56 | +10.4%
GIN | 95.08 \(\pm\) 0.50 | +3.3% | 92.52 \(\pm\) 0.51 | +6.0% | 84.96 \(\pm\) 0.78 | +15.7%
SimGNN | 95.38 \(\pm\) 0.76 | +3.0% | 94.31 \(\pm\) 1.01 | +4.3% | 93.45 \(\pm\) 0.54 | +5.2%
GMN | 94.15 \(\pm\) 0.62 | +3.9% | 95.92 \(\pm\) 1.38 | +2.6% | 94.76 \(\pm\) 0.45 | +3.7%
GraphSim | 97.46 \(\pm\) 0.30 | +0.6% | 96.49 \(\pm\) 0.28 | +2.0% | 94.48 \(\pm\) 0.73 | +4.0%
MGMN (Max + BiLSTM) | 97.44 \(\pm\) 0.32 | +0.8% | 97.84 \(\pm\) 0.40 | +0.6% | 97.22 \(\pm\) 0.36 | +1.1%
MGMN (FCMax + BiLSTM) | 98.07 \(\pm\) 0.06 | +0.1% | 98.29 \(\pm\) 0.10 | +0.1% | 97.83 \(\pm\) 0.11 | +0.5%
MGMN (BiLSTM + BiLSTM) | 97.56 \(\pm\) 0.38 | +0.7% | 98.12 \(\pm\) 0.04 | +0.3% | 97.16 \(\pm\) 0.53 | +1.2%
CGSim (Ours) | 98.21 \(\pm\) 1.06 | \(-\) | 98.38 \(\pm\) 0.91 | \(-\) | 98.30 \(\pm\) 0.70 | \(-\)

Table 4. Experimental Results Are Presented as AUC Scores on the FFmpeg Datasets, Which Are Expressed as a Percentage (%)

4.2 Q2: Contribution of \(\mathcal {L}_{NG}\), \(\mathcal {L}_{GG}\), \(\mathcal {L}_{BCE}\)

To analyze the contributions of \(\mathcal {L}_{NG}\), \(\mathcal {L}_{GG}\), and \(\mathcal {L}_{BCE}\), we conduct a combinational experiment on the parameters \(\alpha\), \(\beta\), and \(\lambda\). We run CGSim on OpenSSL [50, 200]. Table 5 shows the average performance over five runs. Among the single losses, \(\mathcal {L}_{GG}\) provides the best performance, which indicates that the graph-graph matching strategy \(\mathcal {L}_{GG}\) is superior to the graph representation learning model without cross-graph interaction (i.e., \(\mathcal {L}_{BCE}\)). Among the combinations of two losses, \(\mathcal {L}_{NG} + \mathcal {L}_{BCE}\) achieves the highest performance. The best result is produced by applying all three losses simultaneously, which indicates that all mechanisms are effective. With \(\alpha =0.4\), \(\beta =0.4\), and \(\lambda =0.2\), CGSim achieves its best result of 97.70 on OpenSSL [50, 200]. Accordingly, we conclude that both dual cross-graph contrastive learning and graph representation learning improve the quality of graph similarity learning.

Table 5.
Loss | \(\alpha\) | \(\beta\) | \(\lambda\) | AUC
\(\mathcal {L}_{NG}\) | 1 | 0 | 0 | 91.22
\(\mathcal {L}_{GG}\) | 0 | 1 | 0 | 94.05
\(\mathcal {L}_{BCE}\) | 0 | 0 | 1 | 94.32
\(\mathcal {L}_{NG}+ \mathcal {L}_{GG}\) | 1/2 | 1/2 | 0 | 94.34
\(\mathcal {L}_{NG}+ \mathcal {L}_{BCE}\) | 1/2 | 0 | 1/2 | 96.88
\(\mathcal {L}_{GG}+ \mathcal {L}_{BCE}\) | 0 | 1/2 | 1/2 | 95.43
\(\mathcal {L}_{NG}+ \mathcal {L}_{GG}+ \mathcal {L}_{BCE}\) | 1/3 | 1/3 | 1/3 | 97.21

Table 5. Effectiveness of Losses on OpenSSL [50, 200]

4.3 Q3: Efficiency of CGSim

In this section, we choose MGMN as the baseline to compare with CGSim, because MGMN is the best-performing baseline according to Tables 3 and 4 and is also one of the GSL baselines with the lowest time complexity [28]. We record the running time of 100 training epochs of the cross-graph interaction procedure. Figure 4 shows that CGSim consumes much less time on cross-graph interaction. Compared with MGMN, CGSim is 2 to 4 times faster in general, which is a substantial improvement. This experiment also demonstrates that CGSim is more efficient than the compared baselines.

Fig. 4.

Fig. 4. Running time comparisons on different datasets.

4.4 Others: Analysis of Hyperparameters

In Figure 5, we report the influence of the hyperparameters, including the temperature parameter \(\tau\) (in Equation (4)) and the batch size. These experiments are carried out on the OpenSSL [50, 200] dataset. In Figure 5(a), we report the performance of CGSim over a range of batch sizes. When increasing the batch size from 2 to 16, the performance rises steadily; increasing the batch size further makes the performance fluctuate. In Figure 5(b), we report the effect of the parameter \(\tau\). The performance of CGSim improves as \(\tau\) grows until it reaches 0.7, after which it gradually levels off. As the figure shows, different \(\tau\) values can significantly affect the model. For this reason, we set \(\tau =0.7\) and the batch size to 16 for our model, to avoid over-tuning these parameters for various datasets and tasks and to take resource consumption into consideration.

Fig. 5.

Fig. 5. Influence on batch size and \(\tau\) .


5 CONCLUSION

Graph representation learning and cross-graph interaction are pivotal for graph similarity learning. Current research focuses on cross-graph attention mechanisms with high complexity. To improve this situation, we improve graph similarity learning from two aspects: (1) graph representation learning and (2) cross-graph interaction. For (1), we train the graph representation learning module and the dual contrastive learning module jointly. For (2), our proposed node-graph matching and graph-graph matching mechanisms significantly reduce the time complexity of cross-graph interaction. These two parts benefit each other. Experimental results show that CGSim effectively enhances both the effectiveness and the efficiency of graph similarity learning.

REFERENCES

[1] Al-Rfou Rami, Perozzi Bryan, and Zelle Dustin. 2019. DDGK: Learning graph representations for deep divergence graph kernels. In Proceedings of the World Wide Web Conference. 37–48.
[2] Bai Yunsheng, Ding Hao, Bian Song, Chen Ting, Sun Yizhou, and Wang Wei. 2019. SimGNN: A neural network approach to fast graph similarity computation. In Proceedings of the 12th ACM International Conference on Web Search and Data Mining (WSDM'19), Culpepper J. Shane, Moffat Alistair, Bennett Paul N., and Lerman Kristina (Eds.). ACM, 384–392.
[3] Bai Yunsheng, Ding Hao, Gu Ken, Sun Yizhou, and Wang Wei. 2020. Learning-based efficient graph similarity computation via multi-scale convolutional set matching. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 3219–3226.
[4] Bromley Jane, Bentz James W., Bottou Léon, Guyon Isabelle, LeCun Yann, Moore Cliff, Säckinger Eduard, and Shah Roopak. 1993. Signature verification using a "Siamese" time delay neural network. Int. J. Pattern Recogn. Artific. Intell. 7, 04 (1993), 669–688.
[5] Cardoso Carlota, Sousa Rita T., Köhler Sebastian, and Pesquita Catia. 2020. A collection of benchmark data sets for knowledge graph-based similarity in the biomedical domain. Database 2020 (Nov. 2020). Retrieved from https://academic.oup.com/database/article/doi/10.1093/database/baaa078/5979744?login=false.
[6] Chen Ting, Kornblith Simon, Norouzi Mohammad, and Hinton Geoffrey. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning. PMLR, 1597–1607.
[7] Chen Yujun, Sun Ke, Pu Juhua, Xiong Zhang, and Zhang Xiangliang. 2020. Grapasa: Parametric graph embedding via siamese architecture. Info. Sci. 512 (2020), 1442–1457.
[8] Fu Keren, Fan Deng-Ping, Ji Ge-Peng, Zhao Qijun, Shen Jianbing, and Zhu Ce. 2021. Siamese network for RGB-D salient object detection and beyond. IEEE Trans. Pattern Anal. Mach. Intell. 44, 9 (2021), 5541–5559.
[9] Gao Tianyu, Yao Xingcheng, and Chen Danqi. 2021. SimCSE: Simple contrastive learning of sentence embeddings. arXiv:2104.08821.
[10] Gori Marco, Maggini Marco, and Sarti Lorenzo. 2005. Exact and approximate graph matching using random walks. IEEE Trans. Pattern Anal. Mach. Intell. 27, 7 (2005), 1100–1111.
[11] Gori Marco, Maggini Marco, and Sarti Lorenzo. 2005. The RW2 algorithm for exact graph matching. In Proceedings of the International Conference on Pattern Recognition and Image Analysis. Springer, 81–88.
[12] Grill Jean-Bastien, Strub Florian, Altché Florent, Tallec Corentin, Richemond Pierre H., Buchatskaya Elena, Doersch Carl, Pires Bernardo Avila, Guo Zhaohan Daniel, Azar Mohammad Gheshlaghi et al. 2020. Bootstrap your own latent: A new approach to self-supervised learning. arXiv:2006.07733.
[13] Hartmanis Juris. 1982. Computers and intractability: A guide to the theory of NP-completeness (Michael R. Garey and David S. Johnson). SIAM Rev. 24, 1 (1982), 90.
[14] Hassani Kaveh and Khasahmadi Amir Hosein. 2020. Contrastive multi-view representation learning on graphs. In Proceedings of the International Conference on Machine Learning. PMLR, 4116–4126.
[15] He Kaiming, Fan Haoqi, Wu Yuxin, Xie Saining, and Girshick Ross. 2020. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9729–9738.
[16] Hlaoui Adel and Wang Shengrui. 2002. A new algorithm for inexact graph matching. In Object Recognition Supported by User Interaction for Service Robots, Vol. 4. IEEE, 180–183.
[17] Jiang Bo, Tang Jin, Cao Xiaochun, and Luo Bin. 2017. Lagrangian relaxation graph matching. Pattern Recogn. 61 (2017), 255–265.
[18] Jin Di, Huo Cuiying, Liang Chundong, and Yang Liang. 2021. Heterogeneous graph neural network via attribute completion. In Proceedings of the Web Conference. 391–400.
[19] Jin Di, Wang Luzhi, Zheng Yizhen, Li Xiang, Jiang Fei, Lin Wei, and Pan Shirui. 2022. CGMN: A contrastive graph matching network for self-supervised graph similarity learning. arXiv:2205.15083.
[20] Jin Di, Yu Zhizhi, Jiao Pengfei, Pan Shirui, Yu Philip S., and Zhang Weixiong. 2021. A survey of community detection approaches: From statistical modeling to deep learning. arXiv:2101.01669.
[21] Jin Ming, Zheng Yizhen, Li Yuan-Fang, Gong Chen, Zhou Chuan, and Pan Shirui. 2021. Multi-scale contrastive siamese networks for self-supervised graph representation learning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'21), Zhou Zhi-Hua (Ed.). ijcai.org, 1477–1483.
[22] Kanezashi Hiroki, Suzumura Toyotaro, Garcia-Gasulla Dario, Oh Min-hwan, and Matsuoka Satoshi. 2018. Adaptive pattern matching with reinforcement learning for dynamic graphs. In Proceedings of the IEEE 25th International Conference on High Performance Computing (HiPC'18). IEEE, 92–101.
[23] Khosla Prannay, Teterwak Piotr, Wang Chen, Sarna Aaron, Tian Yonglong, Isola Phillip, Maschinot Aaron, Liu Ce, and Krishnan Dilip. 2020. Supervised contrastive learning. arXiv:2004.11362.
[24] Kipf Thomas N. and Welling Max. 2017. Semi-supervised classification with graph convolutional networks. In Proceedings of the 5th International Conference on Learning Representations (ICLR'17). OpenReview.net. Retrieved from https://openreview.net/forum?id=SJU4ayYgl.
[25] Koutra Danai, Parikh Ankur, Ramdas Aaditya, and Xiang Jing. 2011. Algorithms for graph similarity and subgraph matching. In Proc. Ecol. Inference Conf., Vol. 17.
[26] Li Yujia, Gu Chenjie, Dullien Thomas, Vinyals Oriol, and Kohli Pushmeet. 2019. Graph matching networks for learning the similarity of graph structured objects. In Proceedings of the 36th International Conference on Machine Learning (ICML'19) (Proceedings of Machine Learning Research), Chaudhuri Kamalika and Salakhutdinov Ruslan (Eds.), Vol. 97. PMLR, 3835–3845. Retrieved from http://proceedings.mlr.press/v97/li19d.html.
[27] Li Yujia, Tarlow Daniel, Brockschmidt Marc, and Zemel Richard S. 2016. Gated graph sequence neural networks. In Proceedings of the 4th International Conference on Learning Representations (ICLR'16), Bengio Yoshua and LeCun Yann (Eds.). Retrieved from http://arxiv.org/abs/1511.05493.
[28] Ling Xiang, Wu Lingfei, Wang Saizhuo, Ma Tengfei, Xu Fangli, Liu Alex X., Wu Chunming, and Ji Shouling. 2021. Multilevel graph matching networks for deep graph similarity learning. IEEE Trans. Neural Netw. Learn. Syst. (2021), 1–15.
[29] Liu Yixin, Li Zhao, Pan Shirui, Gong Chen, Zhou Chuan, and Karypis George. 2021. Anomaly detection on attributed networks via contrastive self-supervised learning. IEEE Trans. Neural Netw. Learn. Syst. 33, 6 (2021), 2378–2392.
[30] Ma Guixiang, Ahmed Nesreen K., Willke Theodore L., and Yu Philip S. 2021. Deep graph similarity learning: A survey. Data Min. Knowl. Discov. 35, 3 (2021), 688–725.
[31] Ma Guixiang, Ahmed Nesreen K., Willke Theodore L., Sengupta Dipanjan, Cole Michael W., Turk-Browne Nicholas B., and Yu Philip S. 2019. Deep graph similarity learning for brain data analysis. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2743–2751.
[32] Mikolov Tomas, Sutskever Ilya, Chen Kai, Corrado Greg S., and Dean Jeff. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. MIT Press, 3111–3119.
[33] Papadimitriou Panagiotis, Dasdan Ali, and Garcia-Molina Hector. 2010. Web graph similarity for anomaly detection. J. Internet Serv. Appl. 1, 1 (2010), 19–30.
[34] Perozzi Bryan, Al-Rfou Rami, and Skiena Steven. 2014. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 701–710.
[35] Sarlin Paul-Edouard, DeTone Daniel, Malisiewicz Tomasz, and Rabinovich Andrew. 2020. SuperGlue: Learning feature matching with graph neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'20). IEEE, 4937–4946.
[36] Tang Jian, Qu Meng, Wang Mingzhe, Zhang Ming, Yan Jun, and Mei Qiaozhu. 2015. LINE: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web. 1067–1077.
[37] Velickovic Petar, Cucurull Guillem, Casanova Arantxa, Romero Adriana, Liò Pietro, and Bengio Yoshua. 2018. Graph attention networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR'18). OpenReview.net. Retrieved from https://openreview.net/forum?id=rJXMpikCZ.
[38] Velickovic Petar, Fedus William, Hamilton William L., Liò Pietro, Bengio Yoshua, and Hjelm R. Devon. 2019. Deep graph infomax. In Proceedings of the International Conference on Learning Representations (ICLR'19).
[39] Wang Runzhong, Yan Junchi, and Yang Xiaokang. 2019. Learning combinatorial embedding networks for deep graph matching. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV'19). IEEE, 3056–3065.
[40] Wang Runzhong, Yan Junchi, and Yang Xiaokang. 2020. Combinatorial learning of robust deep graph matching: An embedding based approach. IEEE Trans. Pattern Anal. Mach. Intell. (2020).
[41] Wang Runzhong, Yan Junchi, and Yang Xiaokang. 2020. Graduated assignment for joint multi-graph matching and clustering with application to unsupervised graph matching network learning. In Advances in Neural Information Processing Systems 33 (NeurIPS'20), Larochelle Hugo, Ranzato Marc'Aurelio, Hadsell Raia, Balcan Maria-Florina, and Lin Hsuan-Tien (Eds.). Retrieved from https://proceedings.neurips.cc/paper/2020/hash/e6384711491713d29bc63fc5eeb5ba4f-Abstract.html.
[42] Wu Man, Pan Shirui, and Zhu Xingquan. 2022. Attraction and repulsion: Unsupervised domain adaptive graph contrastive learning network. IEEE Trans. Emerg. Top. Comput. Intell. 6, 5 (2022), 1079–1091.
[43] Wu Zonghan, Pan Shirui, Long Guodong, Jiang Jing, and Zhang Chengqi. 2022. Beyond low-pass filtering: Graph convolutional networks with automatic filtering. IEEE Trans. Knowl. Data Eng. (2022), 1–12.
[44] Wu Zonghan, Zheng Da, Pan Shirui, Gan Quan, Long Guodong, and Karypis George. 2022. TraverseNet: Unifying space and time in message passing for traffic forecasting. IEEE Trans. Neural Netw. Learn. Syst. (2022), 1–11.
[45] Xu Keyulu, Hu Weihua, Leskovec Jure, and Jegelka Stefanie. 2019. How powerful are graph neural networks? In Proceedings of the 7th International Conference on Learning Representations (ICLR'19). OpenReview.net. Retrieved from https://openreview.net/forum?id=ryGs6iA5Km.
[46] Xu Kun, Wang Liwei, Yu Mo, Feng Yansong, Song Yan, Wang Zhiguo, and Yu Dong. 2019. Cross-lingual knowledge graph alignment via graph matching neural network. arXiv:1905.11605.
[47] Xu Xiaojun, Liu Chang, Feng Qian, Yin Heng, Song Le, and Song Dawn. 2017. Neural network-based graph embedding for cross-platform binary code similarity detection. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS'17), Thuraisingham Bhavani M., Evans David, Malkin Tal, and Xu Dongyan (Eds.). ACM, 363–376.
[48] Yanardag Pinar and Vishwanathan S. V. N. 2015. Deep graph kernels. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1365–1374.
[49] You Yuning, Chen Tianlong, Sui Yongduo, Chen Ting, Wang Zhangyang, and Shen Yang. 2020. Graph contrastive learning with augmentations. Adv. Neural Info. Process. Syst. 33 (2020), 5812–5823.
[50] Yu Tianshu, Wang Runzhong, Yan Junchi, and Li Baoxin. 2020. Learning deep graph matching with channel-independent embedding and Hungarian attention. In Proceedings of the 8th International Conference on Learning Representations (ICLR'20). OpenReview.net. Retrieved from https://openreview.net/forum?id=rJgBd2NYPH.
[51] Yu Zhizhi, Jin Di, Liu Ziyang, He Dongxiao, Wang Xiao, Tong Hanghang, and Han Jiawei. 2021. AS-GCN: Adaptive semantic architecture of graph convolutional networks for text-rich networks. In Proceedings of the IEEE International Conference on Data Mining (ICDM'21). IEEE, 837–846.
[52] Zager Laura A. and Verghese George C. 2008. Graph similarity scoring and matching. Appl. Math. Lett. 21, 1 (2008), 86–94.
[53] Zanfir Andrei and Sminchisescu Cristian. 2018. Deep learning of graph matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'18). IEEE Computer Society, 2684–2693.
[54] Zbontar Jure, Jing Li, Misra Ishan, LeCun Yann, and Deny Stéphane. 2021. Barlow twins: Self-supervised learning via redundancy reduction. In Proceedings of the 38th International Conference on Machine Learning (ICML'21) (Proceedings of Machine Learning Research), Meila Marina and Zhang Tong (Eds.), Vol. 139. PMLR, 12310–12320. Retrieved from http://proceedings.mlr.press/v139/zbontar21a.html.
[55] Zhang He, Wu Bang, Yuan Xingliang, Pan Shirui, Tong Hanghang, and Pei Jian. 2022. Trustworthy graph neural networks: Aspects, methods and trends. arXiv:2205.07424.
[56] Zheng Yizhen, Pan Shirui, Lee Vincent Cs, Zheng Yu, and Yu Philip S. 2022. Rethinking and scaling up graph contrastive learning: An extremely efficient approach with group discrimination. arXiv:2206.01535.
[57] Zheng Yizhen, Zheng Yu, Zhou Xiaofei, Gong Chen, Lee Vincent, and Pan Shirui. 2022. Unifying graph contrastive learning with flexible contextual scopes. arXiv:2210.08792.
[58] Zhou Feng and Torre Fernando De la. 2012. Factorized graph matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 127–134.
[59] Zhu Yanqiao, Xu Yichen, Yu Feng, Liu Qiang, Wu Shu, and Wang Liang. 2020. Deep graph contrastive representation learning. In Proceedings of the 37th International Conference on Machine Learning Workshop on Graph Representation Learning and Beyond.

