A Simple Understanding of GCN [to be continued]
Explaining the GCN Formula
$$ H^{(l+1)} = \sigma(\widetilde{D}^{-\frac{1}{2}}\widetilde{A}\widetilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)}) $$
- $A$: the adjacency matrix of the graph
- $\widetilde{A}$: the adjacency matrix with self-loops, i.e. $\widetilde{A} = A + I$
- $\widetilde{D}$: the degree matrix of the adjacency matrix with self-loops
- $\widetilde{D}_{ii} = \sum_{j} \widetilde{A}_{ij}$
- $H^{(l)}$: the node feature matrix at layer $l$
- $W^{(l)}$: the trainable weight matrix of layer $l$
- $\sigma$: a nonlinear activation function (e.g. ReLU)
For example, if
$$ \widetilde{D} = \begin{bmatrix} 1 & 0 & 0 \\\\ 0 & 2 & 0 \\\\ 0 & 0 & 3 \end{bmatrix}$$
then
$$ \widetilde{D}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\\\ 0 & \frac{1}{2} & 0 \\\\ 0 & 0 & \frac{1}{3} \end{bmatrix}$$
where $\widetilde{D}^{-1}$ is the inverse of $\widetilde{D}$.
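As a quick sanity check (a small sketch of my own, not part of the original derivation): inverting a diagonal degree matrix reduces to taking the reciprocal of each diagonal entry.

```python
import torch

# Degree matrix from the example above
D = torch.diag(torch.tensor([1.0, 2.0, 3.0]))

# For a diagonal matrix, inversion is just reciprocals of the diagonal
D_inv = torch.diag(1.0 / torch.diagonal(D))

# Agrees with the general-purpose matrix inverse
assert torch.allclose(D_inv, torch.linalg.inv(D))
```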
The input to a GCN is the adjacency matrix $A$ and the node features $H$. Multiply the two, apply a weight matrix $W$, then an activation function: isn't that just an ordinary neural-network layer? So why do we need an adjacency matrix with self-loops?
Hint: without self-loops we cannot distinguish "the node itself" from "an unconnected node". If we used only $A$, whose diagonal entries are all 0, multiplying it with the feature matrix $H$ would compute a weighted sum of each node's neighbors' features while ignoring the node's own feature.
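Here is a minimal illustration of this point (my own sketch, using the 3-node path graph 0-1-2 that reappears in the data-handling section below): with $A$ alone, node 0's new feature ignores its own value; with $A + I$, it does not.

```python
import torch

# Path graph 0 - 1 - 2, no self-loops
A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
H = torch.tensor([[-1.], [0.], [1.]])

print(A @ H)                   # node 0 -> 0.0: its own feature -1 is lost
print((A + torch.eye(3)) @ H)  # node 0 -> -1.0: self feature is kept
```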
Why do we need the degree matrix of the self-looped adjacency matrix?
Hint: $A$ is not normalized, so multiplying it with the feature matrix $H$ changes the original distribution (scale) of the features. We therefore normalize $A$, which also balances out the influence of high-degree nodes (the same symmetric normalization used in the normalized graph Laplacian):
$$ \mathrm{Norm}A_{ij}=\frac{A_{ij}}{\sqrt{d_{i}}\sqrt{d_{j}}} $$
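Putting both answers together, one propagation step $H^{(l+1)} = \sigma(\widetilde{D}^{-\frac{1}{2}}\widetilde{A}\widetilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)})$ can be written densely as follows (a sketch of my own; the output width of 4 is an arbitrary choice):

```python
import torch

A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
H = torch.tensor([[-1.], [0.], [1.]])   # [N, in_channels]
W = torch.randn(1, 4)                   # [in_channels, out_channels], arbitrary width

A_tilde = A + torch.eye(A.size(0))      # adjacency with self-loops
deg = A_tilde.sum(dim=1)                # D_tilde_ii = sum_j A_tilde_ij
D_inv_sqrt = torch.diag(deg.pow(-0.5))  # D_tilde^{-1/2}

norm_A = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # symmetric normalization
H_next = torch.relu(norm_A @ H @ W)          # one GCN layer
print(H_next.shape)                          # torch.Size([3, 4])
```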
GCN Under the Hood
Data Handling of Graphs
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]], dtype=torch.long)
Here [[0, 1, 1, 2], [1, 0, 2, 1]] means: node 0 connects to node 1, node 1 connects to nodes 0 and 2, and node 2 connects to node 1.
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)
The three node features are -1, 0, and 1, respectively.
edge_index can be written in either of the following two formats:
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]], dtype=torch.long)
edge_index = torch.tensor([[0, 1], [1, 0], [1, 2], [2, 1]], dtype=torch.long)
The second (edge-list) format is converted to the first with edge_index = edge_index.t().contiguous().
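In PyTorch Geometric these tensors are usually bundled into a Data object (a short sketch following the PyG data-handling docs):

```python
import torch
from torch_geometric.data import Data

edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)

data = Data(x=x, edge_index=edge_index)
print(data)  # Data(x=[3, 1], edge_index=[2, 4])
```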
Code Walkthrough
- Add self-loops to the adjacency matrix (add self-connections to $A$).
- Linearly transform the node feature matrix (compute $H W^{T}$).
- Compute normalization coefficients.
- Normalize node features.
- Sum up neighboring node features ("add" aggregation).
```python
import torch
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree

class GCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='add')  # "Add" aggregation (Step 5).
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        # Step 2: Linearly transform node feature matrix.
        x = self.lin(x)

        # Step 3: Compute normalization.
        row, col = edge_index
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0
        # The result is saved in the tensor `norm` of shape [num_edges].
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

        # Step 4-5: Start propagating messages.
        return self.propagate(edge_index, x=x, norm=norm)

    def message(self, x_j, norm):
        # x_j has shape [E, out_channels]

        # Step 4: Normalize node features.
        return norm.view(-1, 1) * x_j
```
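A quick usage sketch (my own addition), applying this layer to the toy graph from the data-handling section; in_channels=1 matches the 1-dimensional node features, and out_channels=4 is arbitrary:

```python
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1.], [0.], [1.]])

conv = GCNConv(in_channels=1, out_channels=4)
out = conv(x, edge_index)
print(out.shape)  # torch.Size([3, 4])
```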