Van Thuy Hoang
Dept. of Artificial Intelligence,
The Catholic University of Korea
hoangvanthuy90@gmail.com
Park et al., ICLR 2022
2
 Previous approaches either encode the absolute position of a node in a linearized sequence of nodes,
 or encode the position relative to another node using bias terms.
 This work proposes a relative positional encoding for graphs to overcome the weaknesses of the previous approaches.
https://github.com/Namkyeong/AFGRL
3
Problems
 Explicit representations of position available in graph convolutional networks are lost, so incorporating graph structure into the hidden representations of self-attention is a key challenge.
 Linearizing the graph with the graph Laplacian to encode the absolute position of each node
  loses the preciseness of position due to linearization.
 Encoding the position relative to another node with bias terms
  loses a tight integration of node-edge and node-spatial information.
4
Proposed Method
 Introduces two sets of learnable positional encoding vectors to represent the spatial relation and the edge between two nodes.
 Considers the interaction between:
  node features
  the two encoding vectors
 to integrate both node-spatial relation and node-edge information.
5
BACKGROUND
 The self-attention module computes the query q, key k, and value v with independent linear transformations.
 The attention map is computed by applying a scaled dot product between the queries and the keys.
 The self-attention module outputs the next hidden feature by applying a weighted summation over the values.
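A minimal sketch of the standard scaled dot-product self-attention described above; the notation ($x_i$ for the input feature of node $i$, $W^Q, W^K, W^V$ for the projection matrices, $d$ for the hidden dimension) is assumed here rather than taken from the slides:

$q_i = x_i W^Q, \quad k_i = x_i W^K, \quad v_i = x_i W^V$
$A_{ij} = \mathrm{softmax}_j\!\left( \frac{q_i \cdot k_j}{\sqrt{d}} \right)$
$x'_i = \sum_j A_{ij}\, v_j$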
6
GRAPH WITH TRANSFORMER
 Graphormer adds two additional terms to the self-attention module to encode graph information in the attention map.
 GT: the graph Laplacian represents the structure of a graph with respect to its nodes.
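For reference, hedged sketches of the two prior approaches, written with assumed notation rather than the exact equations of the papers:

 Graphormer adds a spatial bias and an edge bias to the attention logits, e.g. $\hat{A}_{ij} = \frac{q_i \cdot k_j}{\sqrt{d}} + b_{\phi(i,j)} + c_{ij}$, where $\phi(i,j)$ is the shortest-path distance between nodes $i$ and $j$, $b$ is a learnable scalar per distance, and $c_{ij}$ is a bias derived from edge features; both biases are independent of the node features.
 GT attaches an absolute positional encoding to each node from the eigenvectors of the graph Laplacian $L = I - D^{-1/2} A D^{-1/2}$, e.g. $x_i \leftarrow x_i + \lambda_i W^P$ with $\lambda_i$ the $i$-th row of the eigenvector matrix; this is the linearization step that loses positional preciseness.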
7
The proposed Graph Relative Positional Encoding (GRPE)
 Left: an example of how GRPE processes the relative relation between nodes; in the example, L is set to 2.
 Right: the proposed self-attention mechanism.
 Two relative positional encodings, the spatial encoding and the edge encoding, are used to encode the graph in both the attention map and the value.
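One way to read the example (an assumption about the exact definition; $\psi$ and $\mathrm{SPD}$ are hypothetical names): L acts as a clipping threshold on the shortest-path distance, so the spatial relation between two nodes would be

$\psi(i,j) = \min(\mathrm{SPD}(i,j),\, L)$

and the learnable spatial encodings are indexed by this clipped distance, possibly with special indices reserved for unreachable pairs and the virtual node.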
8
NODE-AWARE ATTENTION
 two terms to encode graph on the attention map with two newly
proposed encodings
 The 1st term:
 It encodes graph by considering interaction between node feature
and spatial relation in graph.
 The 2nd term:
 It encodes graph by considering interaction between node feature
and edge in graph.
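A hedged sketch of the form the two terms could take; $E^{\mathrm{spatial}}$ and $E^{\mathrm{edge}}$ are hypothetical names for the two sets of learnable encoding vectors, indexed by the spatial relation $\psi(i,j)$ and the edge type $e_{ij}$, with Q/K superscripts marking (possibly separate) query-side and key-side copies:

$a^{\mathrm{spatial}}_{ij} = q_i \cdot E^{\mathrm{spatial,Q}}_{\psi(i,j)} + k_j \cdot E^{\mathrm{spatial,K}}_{\psi(i,j)}$
$a^{\mathrm{edge}}_{ij} = q_i \cdot E^{\mathrm{edge,Q}}_{e_{ij}} + k_j \cdot E^{\mathrm{edge,K}}_{e_{ij}}$

Because each term is a dot product between a node's query or key and an encoding vector, the graph information interacts with the node features instead of entering as a feature-independent bias.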
9
NODE-AWARE ATTENTION
 The two terms consider the node-spatial relation and the node-edge relation; in contrast, Graphormer's bias terms do not consider the interaction with the node features.
 Finally, the two terms are added to the scaled dot-product attention map to encode graph information.
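Putting the pieces together, the node-aware attention map could then be written as (a sketch; whether the graph terms share the $\sqrt{d}$ scaling with the dot product is an assumption):

$\hat{A}_{ij} = \frac{q_i \cdot k_j + a^{\mathrm{spatial}}_{ij} + a^{\mathrm{edge}}_{ij}}{\sqrt{d}}, \qquad A_{ij} = \mathrm{softmax}_j(\hat{A}_{ij})$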
10
GRAPH-ENCODED VALUE
 Encodes the graph into the hidden features of self-attention when the values are weighted-summed with the attention map.
 Both the spatial encoding and the edge encoding are added into the value via summation.
 This directly encodes graph information into the hidden features of the value.
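A sketch of the graph-encoded value, assuming value-side counterparts $E^{\mathrm{value,spatial}}$ and $E^{\mathrm{value,edge}}$ of the two encodings (the names are hypothetical):

$x'_i = \sum_j A_{ij}\left( v_j + E^{\mathrm{value,spatial}}_{\psi(i,j)} + E^{\mathrm{value,edge}}_{e_{ij}} \right)$

In this form the spatial and edge information reaches the output hidden feature directly, not only through the attention weights.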
11
EXPERIMENT
 VIRTUAL NODE
 The role of a virtual node is similar to that of special tokens such as a classification token.
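A brief note on how this is typically realized (an assumption based on the common virtual-node setup, not an exact description of the slides): the virtual node is connected to every node of the graph through a dedicated special relation, and its final-layer representation is used as the graph-level readout for prediction, e.g. $h_G = h^{(\mathrm{last})}_{\mathrm{[VNODE]}}$, much like a [CLS] token in sequence models.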
12
EXPERIMENT
 Results on ZINC
13
EXPERIMENT
 Results on MolHIV
 Results on MolPCBA
14
ABLATION STUDY
 Effects of the components of GRPE on the ZINC dataset (the lower, the better).
 Empirically, sharing the encodings does not significantly change the performance of the model, especially on large datasets.