Significantly Improved Performance Of Graph Neural Networks By Focusing On Dual Connectivity

GNN 31/05/2024

3 main points
✔️ Focusing on the expressive power of graph dual connectivity, which has not received much attention in graph neural networks (GNNs)
✔️ Found that existing GNNs have poor expressive power for dual connectivity
✔️ Proposed Graphormer-GD as a Transformer-based fast GNN with perfect expressivity for dual connectivity

RETHINKING THE EXPRESSIVE POWER OF GNNS VIA GRAPH BICONNECTIVITY
written by Bohang Zhang, Shengjie Luo, Liwei Wang, Di He
(Submitted on Submitted on 23 Jan 2023 (v1), last revised 11 Feb 2024 (this version, v3))
Comments: Extended from ICLR 2023 Outstanding Paper; 60 pages, 12 figures. Fix typos in the previous version
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

code：

The images used in this article are from the paper, the introductory slides, or were created based on them.

Introduction

Graph neural networks (GNNs) were first proposed by Gori and Scarcelli in 2005. A graph is a relation represented by vertices (nodes) and edges (edges) connecting the vertices to each other; the goal of a GNN is to learn features that aggregate the features of each node's neighbors according to the graph structure of the given data.

Based on the learned features, for example, it predicts the label given to each node; the effect of using GNNs is that it is easier to successfully guess the unknown label of a node based on its proximity on the graph.

However,depending on how the features of each node's neighbors are aggregated according to the graph structure of the given data, features that do not reflect differences in graph structure may be learned.

In short, the expectation is that different features should be learned when features are learned according to different graph structures, but depending on the type of GNN, it is possible that the resulting features may be identical even though they are learned based on different graph structures.

The Weisfeiler-Lehman test (WL test) is often used as a comparison for the representativeness of GNN graphs. Isomorphism means that the nodes of the graphs are connected in the same way, just in a different order. The only difference is the numbering scheme that identifies the nodes of the graph.

The WL test is a classic method proposed in 1968 to determine the isomorphism of this graph. The test is such that if the nodes of a graph are painted according to a certain rule, graphs with identical color histograms are considered isomorphic. Although it has the drawback that graphs with the same number of nodes and degree are considered isomorphic, it is considered a fast and powerful method of judgment.

While there are many GNN studies that have examined whether it is possible to learn features that can correctly determine graph isomorphism as well as or better than the WL test, there are no GNN studies that focus on the dual connectivity of the graph.

In contrast, the paper presented here focuses on graph dual connectivity. We surveyed the expressive power of existing GNN methods for graph dual connectivity and found that most methods lack expressive power with respect to graph dual connectivity.

Furthermore, we propose Graphormer-GD as a Transformer-based computationally efficient GNN with high expressive power with respect to graph dual connectivity. This allows us to achieve high performance in benchmarking real-world problems. This paper was selected as an Outstanding Paper for ICLR 2023.

In the following sections, we will discuss thedual connectivity of graphs, the WL test, GD-WL, and Graphormer-GD and the results of their evaluation.

Dual connectivity of graphs

I explained that the paper presented here focuses on the dual connectivity of graphs, but what does it mean for a graph to have dual connectivity in the first place?

Dual connectivity of a graph refers to a graph such that if one node (or edge) is removed from a graph, the connectivity of the graph (such that one can reach all nodes from a node in the graph through the edges) is preserved.

If it is satisfied even if a node is excluded, it is called node dual connectivity; if it is satisfied even if an edge is excluded, it is called edge connectivity in particular.

Conversely, if removing a node in a graph causes it to lose connectivity, that node is called a cut node. If removing an edge causes it to lose connectivity, the edge is called a cut edge.

Examples of cut nodes and cut edges are shown in Figure 1.

Figure 1. dual connectivity of graphs (a) original graph (b) cut-edge tree (c) cut-node (vertex) tree

Figure 1(a) shows the original graph. The red edge shown in Figure 1(b) is a cut edge. Removing this cut edge causes the graph to lose its connectivity.The orange-edged node in Figure 1(c) is a cut node. If this node is removed, the connectivity of the graph is lost.

Dual connectivity of graphs itself is a property that has long been the focus of attention in graph theory. In terms of application, for example, in the case of a communication network, if one device somewhere on the communication network breaks down, communication will be disrupted. Therefore, a communication network must be created in such a way that there are no cut nodes. In this sense, it can be an important property in the real world.

For a forecasting task such as this, where the dual connectivity of the graph is an important feature, we would expect that using a GNN that cannot represent the dual connectivity of the graph would significantly reduce forecasting accuracy.

WL test (color adjustment algorithm)

There are many different types of WL tests, but we will discuss the most basic first-order WL test and explain that its challenge is that it does not identify dual connectivity well.

WL test of the first order

The first-order WL test is also called the color refinement algorithm.

First, if the nodes are not colored, initialize them with an obvious color. For example, if the graph is a molecular structure, atoms are considered nodes and bonds between atoms are considered edges, and initialization is such that different atom names are given different colors.

Next, for each node, create a pair of its own color and a multiset (a set that may contain more than one of the same element) of the colors of its one-neighbor neighbor nodes. For each node's pair (its own color and the multiset of colors of its neighbors), a different color is assigned to each unique pair, corresponding one-to-one. This process is repeated until the color before and after adjustment is no longer different.

For two graphs, the histograms of the color of the graphs at the end are compared, and if they are the same, the graphs are determined to be isomorphic; otherwise, they are determined to be nonisomorphic.

Identifiability of dual connectivity by first-order WL test

One reason for focusing on dual connectivity when considering the expressive power of GNNs is that in some cases, the first-order WL test does not identify dual connectivity.

As shown in Figure 2, there exist examples where the first-order WL test cannot identify isomorphisms in graphs with dual connectivity.

Figure 2: Example of WL test failing to determine isomorphism with graphs with dual connectivity (graphs with different connections are determined to be isomorphic)

In Figure 2, the graph in the first row is a graph with dual connectivity, where the nodes bordered in red are cut nodes and the red lines are cut edges; the graph in the second row is a graph without dual connectivity. Comparing the graphs in rows (a)~(d) with each other, the connections are different. In other words, they are non-isomorphic graphs. However, as for the WL test, since the histograms of colors (the number of occurrences of each color) are the same, the WL test has misjudged them as isomorphic.

Thus, since a simple WL test cannot adequately represent dual connectivity, a WL test that can represent graphs with dual connectivity is desired.

Proposed WL Test GD-WL

In this paper, we mathematically prove that in the WL test, the distance information between nodes can be used to have the expressive power of dual connectivity. A color adjustment algorithm using a generalized distance measure between nodes is the new algorithm proposed in this paper.

GD-WL updates the "multiset of pairs" (distance of (v,u), color of u) of node v for u ∈ [set of nodes in the graph] by giving a unique color to the unique "multiset of pairs".

We will call SPD-WL when the shortest path distance is used as the distance of (v,u) in GD-WL. The shortest path distance is the shortest number of edge passes required to go from node v to node u. SPD-WL is proved in this paper to be fully discriminative with respect to edge dual connectivity. It is not fully discriminative with respect to node dual connectivity and cannot discriminate between the two graphs in the columns of Figure 2(c).

On the other hand, with respect to edge dual connectivity, this paper proves that when resistance distance is used (RD-WL), there is complete discriminability with respect to node dual connectivity. Resistance distance is the effective resistance from node v to node u as a distance, when the graph is considered as an electric circuit and the edge is considered as a resistance of 1 ohm.

Thus, by changing the distance index, both edge dual connectivity and node dual connectivity can be identified, and thus, in GD-WL, both SPD-WL and RD-WL can be used for complete identification of dual connectivity.

However, there are some graphs for which isomorphism determination cannot be performed even when GD-WL is used with PD-WL and RD-WL simultaneously. These are shown in Figure 3.

Figure 3. Example of graph identification failure in SPD-WL and RD-WL

Figure 3(a) shows an example where SPD-WL fails to identify a graph, but RD-WL succeeds in identifying a graph, and Figure 3(b) shows an example where both SPD-WL and RD-WL fail to identify a graph.

Proposed GNN Graphormer-GD

GD-WL can be implemented by incorporating the distance information between nodes into Transformer's multi-head attention, and GD-WL can be computed efficiently by taking advantage of Transformer's parallel processing. The proposed method is named Graphormer-GD after Graphormer, which is a graph neural network based on Transformer.

Graphormer-GD evaluation results

This section describes the evaluation results of Graphormer-GD, the GNN proposed in this paper.

Cut node and cut edge detection tasks

Table 1shows theevaluation results of the tasks to detect cut nodes and cut nodes.

Table 1.Cut vertex and cut edge detection tasks

While existing GNNs (GCN~Graphormer) have an accuracy of less than 85%,Graphormer-GDachieves an overwhelming 100% accuracy. With respect to dual connectivity, we can confirm the lack of expressive power of existing graph neural networks.

Also, as per theory, the proposedGraphormer-GD confirms that the accuracy of node dual connectivity detection deteriorates when the resistance distance is removed (w/o. Resistance Distance).

Evaluation on real-world problem benchmarks

As a benchmark for real-world problems, we evaluated accuracy on a dataset called ZINC, which predicts chemical properties from molecular graphs.

To check the scalability of Graphormer-GD, we evaluated ZINC-Subset (12,000 molecular graphs) consisting of a subset of ZINC data and ZINC-Full (250,000 molecular graphs) consisting of all data, respectively, and Table 2 shows the results.

Params in Table 2 are the number of model parameters for GNN. For fairness, the model parameter size of the proposed Graphormere-GD is about the same size as other methods.

The lower the Test MAE (Mean Absolute Error) in Table 2, the more accurate the prediction, and GD-WL is the most accurate compared to the existing methods in Table 2.

By increasing the discriminability of dual connectivity, which has not been focused on, we were able to improve the performance of GNNs.

Time in Table 2 is the training time per training epoch, and it shows that GD-WL requires only about the same processing time as the other methods in Table 2.

From the above, it can be seen that the GD-WL offers both accuracy and speed.

Conclusion

In the paper presented here, it was shown that the performance of GNNs can be improved by improving the expressiveness of the dual connectivity of graphs, which has not been focused on in terms of the expressiveness of GNNs.

The WL test, which is often discussed in connection with the expressive power of GNNs, showed that in some cases the WL test fails to identify the dual connectivity of a graph, indicating that the WL test needs to be improved.

We proposed GD-WL as a WL test using distance information between nodes andshowed that it can identify dual connectivity.

Furthermore, we showed that GD-WL can be implemented simply by learning from existing Transformer-based GNN methods and proposed it as Graphormer-GD.

Graphormer-GD showed overwhelming performance on the task of detecting dual connectivity and achieved the best accuracy compared to existing methods on real-world benchmark problems.

It was a very heavy paper that combined theory and practice, as it was the only paper selected as Outstanding Paper at ICLR2023, the top conference on AI. Including the appendices, it runs to 60 pages. There are many proofs , many illustrations, and many existing methods to compare. The possibilities, limitations, and extensions of the proposals are thoroughly pursued. The paper seemed to me to be motivated by strong interest and inquisitiveness.

The paper presented here is a paper published by Peking University. In recent years, I have heard that 50% of the papers accepted by NeurIPS are by researchers from China. I was impressed by the high level of research power in China. The first author seems to be from the School of the Gifted Young, and although I have not researched it in detail, from the looks of this letter, Chinese geniuses may be gathering for AI research.

Categories related to this article

Yokota