A Simple Understanding of GCN [to be continued]
Explaining the GCN Formula
$$ H^{(l+1)} = \sigma(\widetilde{D}^{-\frac{1}{2}}\widetilde{A}\widetilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)}) $$
- $A$: the adjacency matrix of the graph
- $\widetilde{A}$: the adjacency matrix with self-loops, i.e. $\widetilde{A} = A + I$
- $\widetilde{D}$: the degree matrix of the adjacency matrix with self-loops
- $\widetilde{D}_{ii} = \sum_{j} \widetilde{A}_{ij}$
- $H^{(l)}$: the node feature matrix at layer $l$
- $W^{(l)}$: the trainable weight matrix of layer $l$
- $\sigma$: a nonlinear activation function (e.g. ReLU)
For example, if
$$ \widetilde{D} = \begin{bmatrix} 1 & 0 & 0 \\\\ 0 & 2 & 0 \\\\ 0 & 0 & 3 \end{bmatrix}$$
then
$$ \widetilde{D}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\\\ 0 & \frac{1}{2} & 0 \\\\ 0 & 0 & \frac{1}{3} \end{bmatrix}$$
where $\widetilde{D}^{-1}$ is the inverse of $\widetilde{D}$.
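As a quick sanity check (a small sketch of my own, not part of the original derivation): inverting a diagonal degree matrix reduces to taking the reciprocal of each diagonal entry.

```python
import torch

# Degree matrix from the example above
D = torch.diag(torch.tensor([1.0, 2.0, 3.0]))

# For a diagonal matrix, inversion is just reciprocals of the diagonal
D_inv = torch.diag(1.0 / torch.diagonal(D))

# Agrees with the general-purpose matrix inverse
assert torch.allclose(D_inv, torch.linalg.inv(D))
```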
The input to a GCN is the adjacency matrix $A$ and the node features $H$. Multiply the two, apply a weight matrix $W$, then an activation function: isn't that just an ordinary neural-network layer? So why do we need an adjacency matrix with self-loops?
Hint: without self-loops we cannot distinguish "the node itself" from "an unconnected node". If we used only $A$, whose diagonal entries are all 0, multiplying it with the feature matrix $H$ would compute a weighted sum of each node's neighbors' features while ignoring the node's own feature.
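Here is a minimal illustration of this point (my own sketch, using the 3-node path graph 0-1-2 that reappears in the data-handling section below): with $A$ alone, node 0's new feature ignores its own value; with $A + I$, it does not.

```python
import torch

# Path graph 0 - 1 - 2, no self-loops
A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
H = torch.tensor([[-1.], [0.], [1.]])

print(A @ H)                   # node 0 -> 0.0: its own feature -1 is lost
print((A + torch.eye(3)) @ H)  # node 0 -> -1.0: self feature is kept
```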
Why do we need the degree matrix of the self-looped adjacency matrix?
Hint: $A$ is not normalized, so multiplying it with the feature matrix $H$ changes the original distribution (scale) of the features. We therefore normalize $A$, which also balances out the influence of high-degree nodes (the same symmetric normalization used in the normalized graph Laplacian):
$$ \mathrm{Norm}A_{ij}=\frac{A_{ij}}{\sqrt{d_{i}}\sqrt{d_{j}}} $$
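Putting both answers together, one propagation step $H^{(l+1)} = \sigma(\widetilde{D}^{-\frac{1}{2}}\widetilde{A}\widetilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)})$ can be written densely as follows (a sketch of my own; the output width of 4 is an arbitrary choice):

```python
import torch

A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
H = torch.tensor([[-1.], [0.], [1.]])   # [N, in_channels]
W = torch.randn(1, 4)                   # [in_channels, out_channels], arbitrary width

A_tilde = A + torch.eye(A.size(0))      # adjacency with self-loops
deg = A_tilde.sum(dim=1)                # D_tilde_ii = sum_j A_tilde_ij
D_inv_sqrt = torch.diag(deg.pow(-0.5))  # D_tilde^{-1/2}

norm_A = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # symmetric normalization
H_next = torch.relu(norm_A @ H @ W)          # one GCN layer
print(H_next.shape)                          # torch.Size([3, 4])
```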
GCN Under the Hood
Data Handling of Graphs
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]], dtype=torch.long)
Here [[0, 1, 1, 2], [1, 0, 2, 1]] means: node 0 connects to node 1, node 1 connects to nodes 0 and 2, and node 2 connects to node 1.
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)
The three node features are -1, 0, and 1, respectively.
edge_index can be written in either of the following two formats:
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]], dtype=torch.long)
edge_index = torch.tensor([[0, 1], [1, 0], [1, 2], [2, 1]], dtype=torch.long)
The second (edge-list) format is converted to the first with edge_index = edge_index.t().contiguous().
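In PyTorch Geometric these tensors are usually bundled into a Data object (a short sketch following the PyG data-handling docs):

```python
import torch
from torch_geometric.data import Data

edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)

data = Data(x=x, edge_index=edge_index)
print(data)  # Data(x=[3, 1], edge_index=[2, 4])
```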
Code Walkthrough
- Add self-loops to the adjacency matrix (add self-connections to $A$).
- Linearly transform the node feature matrix (compute $H W^{T}$).
- Compute normalization coefficients.
- Normalize node features.
- Sum up neighboring node features ("add" aggregation).
```python
import torch
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree

class GCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='add')  # "Add" aggregation (Step 5).
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        # Step 2: Linearly transform node feature matrix.
        x = self.lin(x)

        # Step 3: Compute normalization.
        row, col = edge_index
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0
        # The result is saved in the tensor `norm` of shape [num_edges].
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

        # Step 4-5: Start propagating messages.
        return self.propagate(edge_index, x=x, norm=norm)

    def message(self, x_j, norm):
        # x_j has shape [E, out_channels]

        # Step 4: Normalize node features.
        return norm.view(-1, 1) * x_j
```
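A quick usage sketch (my own addition), applying this layer to the toy graph from the data-handling section; in_channels=1 matches the 1-dimensional node features, and out_channels=4 is arbitrary:

```python
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1.], [0.], [1.]])

conv = GCNConv(in_channels=1, out_channels=4)
out = conv(x, edge_index)
print(out.shape)  # torch.Size([3, 4])
```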