Meta-GNN: On Few-shot Node Classification in Graph Meta-learning

2021-06-04


https://dl.acm.org/doi/pdf/10.1145/3357384.3358106

Meta-GNN: On Few-shot Node Classification in Graph Meta-learning, CIKM, 2019

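Summary: This is a very simple paper. It just combines MAML with a GNN to tackle few-shot node classification on graphs. The work is shallow, and the main text is only a little over three pages, but it is the first paper to extend meta-learning to few-shot learning on graphs, which is why it could still be published at CIKM. The paper also releases its experiment source code, which is worth a look.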

1. Introduction

1.1 Abstract

Meta-learning has received a tremendous recent attention as a possible approach for mimicking human intelligence, i.e., acquiring new knowledge and skills with little or even no demonstration. Most of the existing meta-learning methods are proposed to tackle few-shot learning problems such as image and text, in rather Euclidean domain. However, there are very few works applying meta-learning to non-Euclidean domains, and the recently proposed graph neural networks (GNNs) models do not perform effectively on graph few-shot learning problems. Towards this, we propose a novel graph meta-learning framework – Meta-GNN – to tackle the few-shot node classification problem in graph meta-learning settings. It obtains the prior knowledge of classifiers by training on many similar few-shot learning tasks and then classifies the nodes from new classes with only few labeled samples. Additionally, Meta-GNN is a general model that can be straightforwardly incorporated into any existing state-of-the-art GNN. Our experiments conducted on three benchmark datasets demonstrate that our proposed approach not only improves the node classification performance by a large margin on few-shot learning problems in meta-learning paradigm, but also learns a more general and flexible model for task adaption.

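In other words: meta-learning offers a possible way of mimicking how humans learn, picking up new knowledge from only a handful of examples, and it has attracted a lot of attention in recent years. Most existing meta-learning methods target few-shot problems in Euclidean domains such as images and text; only a few apply meta-learning to few-shot problems in non-Euclidean domains. Meanwhile, the recently popular GNNs perform poorly in few-shot settings. To address this, the authors propose a graph meta-learning framework, Meta-GNN, for few-shot node classification in the graph meta-learning setting. Meta-GNN learns prior knowledge from many similar few-shot tasks and uses it to classify nodes from unseen classes. Meta-GNN is also a general framework that can be combined with any existing well-performing GNN model. Experiments on three benchmark datasets show that the proposed method not only achieves high node classification accuracy but also learns a more general and flexible model for task adaptation.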

1.2 What the paper does

The proposed Meta-GNN framework is the first to combine meta-learning with GNNs to address few-shot problems on graphs.

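Given an undirected graph $\mathcal G=(V, E, \mathrm{A}, \mathrm{X})$, where $V=\left\{v_{1}, v_{2}, \ldots, v_{i}, \ldots, v_{n}\right\}$ is the node set and $E=\left\{e_{i, j}=\left(v_{i}, v_{j}\right)\right\} \subseteq(V \times V)$ is the edge set. $\mathrm{A} \in \mathbb{R}^{n \times n}$ is the adjacency matrix, whose entry $a_{ij}$ indicates whether there is an edge between nodes $v_i$ and $v_j$. $\mathrm{X} \in \mathbb{R}^{n \times d}$ is the feature matrix, and $x_i\in\mathbb R^d$ is the feature vector of node $v_i$.

Problem definition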

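Learning goal: learn a classifier that can be applied to new classes not seen during training, where each new class has only a few labeled samples.
During training, the class of every node $v_i$ belongs to $C_1$, and the model learns a classifier $f_\theta$. At test time, the class of every node belongs to $C_2$, where $C_2$ and $C_1$ share no classes; the classifier should label the unlabeled test nodes as accurately as possible.
If each class has $K$ labeled nodes, the task is called a $|C_2|$-way $K$-shot learning problem, where $K$ is small.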

2. Method

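Meta-GNN is built as GNN + MAML: training on a large number of meta-training tasks lets the model adapt quickly to new tasks that have only a few labeled samples.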

$f_\theta$ denotes the Meta-GNN model with parameters $\theta$;
$\mathcal{D}_{\text {train }}=\left\{\left(x_{1}, y_{1}\right), \ldots,\left(x_{i}, y_{i}\right), \ldots,\left(x_{N}, y_{N}\right)\right\}$ denotes the set of all training samples;
$\mathcal{T}=\left\{\mathcal{T}_{1}, \mathcal{T}_{2}, \cdots, \mathcal{T}_{M}\right\}$ denotes the $M$ meta-tasks built from $\mathcal{D}_{\text {train }}$; the support set $\mathcal S_i$ and query set $\mathcal Q_i$ of each task are both sampled from $\mathcal{D}_{\text {train }}$;
the support set is $\mathcal S_i=\left\{v_{i 1}, v_{i 2}, \ldots, v_{i s}\right\}=\left\{\left(x_{i 1}, y_{i 1}\right),\left(x_{i 2}, y_{i 2}\right), \ldots,\left(x_{i s}, y_{i s}\right)\right\}$, where $s=\left|\mathcal{S}_{i}\right|$, and $x_{is}$ and $y_{is}$ are the feature vector and label of node $v_{is}$.

The overall framework of Meta-GNN is shown in the figure below:

I. Task sampling

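$C_1$ denotes the set of all classes available for meta-training. Sample $|C_2|$ classes from $C_1$, then sample $K$ nodes from each of those classes. The full procedure is: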

(1) $C \leftarrow$ RANDOMSAMPLE $\left(C_{1},\left|C_{2}\right|\right)$;
(2) $\mathcal{S}_{i} \leftarrow$ RANDOMSAMPLE $\left(\mathcal{D}_{C}, K \times\left|C_{2}\right|\right)$;
(3) $\mathcal{Q}_{i} \leftarrow$ RANDOMSAMPLE $\left(\mathcal{D}_{C}-\mathcal{S}_{i}, P\right)$;
(4) $\mathcal{T}_{i}=\mathcal{S}_{i}+\mathcal{Q}_{i}$;
(5) Repeat steps (1) - (4) $M$ times.
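The sampling steps above can be sketched in plain Python. This is a minimal illustration, not the authors' released code: the function name `sample_task`, its parameters, and the toy label dictionary are all hypothetical. It reads $\mathcal{D}_C$ as "the nodes whose label is in $C$" (the notes do not define it), and it draws exactly $K$ support nodes per class, following the prose description.

```python
import random
from collections import defaultdict


def sample_task(labels, train_classes, n_way, k_shot, n_query, rng):
    """Sample one meta-task as (classes, support node ids, query node ids).

    labels: dict mapping node id -> class label (hypothetical input format)
    train_classes: the classes in C_1
    n_way: |C_2|, k_shot: K, n_query: P
    """
    # Step (1): pick |C_2| classes from C_1.
    classes = rng.sample(sorted(train_classes), n_way)

    # D_C (assumed meaning): nodes labeled with a sampled class, grouped by class.
    by_class = defaultdict(list)
    for node, y in labels.items():
        if y in classes:
            by_class[y].append(node)

    # Step (2): K support nodes per class, K * |C_2| in total.
    support = []
    for c in classes:
        support.extend(rng.sample(by_class[c], k_shot))

    # Step (3): P query nodes drawn from D_C minus the support set.
    chosen = set(support)
    remaining = [v for c in classes for v in by_class[c] if v not in chosen]
    query = rng.sample(remaining, n_query)

    # Step (4): the task is the pair (support, query).
    return classes, support, query


# Step (5): repeat M times. Toy data: 100 nodes, 5 classes (made up).
rng = random.Random(0)
labels = {node: node % 5 for node in range(100)}
tasks = [sample_task(labels, range(5), n_way=2, k_shot=3, n_query=4, rng=rng)
         for _ in range(10)]
```

Sampling per class rather than from the pooled $\mathcal{D}_C$ guarantees that every class in the task has exactly $K$ labeled nodes, which is what the $K$-shot definition requires.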

II. Meta-training

Meta-GNN is trained on each of the sampled meta-tasks, with cross-entropy as the loss function:

$$\mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta}\right)=-\sum_{\boldsymbol{x}_{i s}, y_{i s}}\left(y_{i s} \log f_{\theta}\left(\boldsymbol{x}_{i s}\right)+\left(1-y_{i s}\right) \log \left(1-f_{\theta}\left(\boldsymbol{x}_{i s}\right)\right)\right)$$
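As a small numeric illustration of this loss, the sketch below evaluates it for explicit probability vectors. It assumes the formula is applied to each entry of a one-hot label vector and the matching predicted class probability, which is one reading of the notation; `task_loss` and its arguments are hypothetical names.

```python
import math


def task_loss(probs, targets, eps=1e-12):
    """Cross-entropy of the form -(y log p + (1 - y) log(1 - p)), summed.

    probs[i][c]: predicted probability of class c for support node i
    targets[i][c]: 1 if node i belongs to class c, else 0 (one-hot)
    """
    total = 0.0
    for p_row, y_row in zip(probs, targets):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total


# One node, two classes, true class is the first one.
uncertain = task_loss([[0.5, 0.5]], [[1, 0]])
confident = task_loss([[0.9, 0.1]], [[1, 0]])
```

A confident correct prediction yields a smaller loss than an uncertain one, which is the behavior the inner-loop gradient step exploits.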

On each meta-task, perform one or more gradient updates of the following form:

$$\theta_{i}^{\prime}=\theta-\alpha_{1} \frac{\partial \mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta}\right)}{\partial \theta}$$

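Once every meta-task has gone through this training step, which is usually called one inner loop, the training objective over all meta-tasks is:

$$\theta=\underset{\theta}{\arg \min } \sum_{\mathcal{T}_{i} \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta_{i}^{\prime}}\right)$$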
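To see what differentiating this objective involves, the chain rule can be applied to a single task. This is the standard MAML derivation rather than something spelled out in these notes; it assumes exactly one inner update and a twice-differentiable loss.

$$\frac{\partial \mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta_{i}^{\prime}}\right)}{\partial \theta}=\left(\frac{\partial \theta_{i}^{\prime}}{\partial \theta}\right)^{\top} \nabla_{\theta_{i}^{\prime}} \mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta_{i}^{\prime}}\right)=\left(I-\alpha_{1} \nabla_{\theta}^{2} \mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta}\right)\right) \nabla_{\theta_{i}^{\prime}} \mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta_{i}^{\prime}}\right)$$

The Hessian term is what makes the meta-gradient second-order. Dropping it, that is, treating $\partial \theta_{i}^{\prime} / \partial \theta$ as the identity, gives the usual first-order approximation.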

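The overall meta-training objective across all meta-tasks is optimized with stochastic gradient descent (SGD), and the model parameters are updated as follows:

$$\theta \leftarrow \theta-\alpha_{2} \frac{\partial \sum_{\mathcal{T}_{i} \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_{i}}\left(f_{\theta_{i}^{\prime}}\right)}{\partial \theta}$$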

III. Meta-testing

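Feed the new few-shot meta-learning task $\mathcal T_{mt}$ to Meta-GNN, fine-tune the Meta-GNN parameters on the task's support set, and then evaluate Meta-GNN on the task's query set. The overall algorithm for meta-training and meta-testing is as follows: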
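As a rough illustration of the meta-testing step only (a toy, not the paper's algorithm listing), the sketch below adapts a meta-learned scalar parameter on a support set and then scores it on a query set. It assumes a quadratic loss $\frac{1}{2}(\theta - t)^2$ in place of the GNN's cross-entropy; `adapt`, `evaluate`, and the numbers are hypothetical.

```python
def adapt(theta, support_targets, alpha1=0.1, steps=5):
    """Fine-tune theta on the support set with a few gradient steps."""
    for _ in range(steps):
        # Mean gradient of 0.5 * (theta - t)^2 over the support set.
        g = sum(theta - t for t in support_targets) / len(support_targets)
        theta = theta - alpha1 * g
    return theta


def evaluate(theta, query_targets):
    """Mean toy loss on the query set; no parameter update happens here."""
    return sum(0.5 * (theta - t) ** 2 for t in query_targets) / len(query_targets)


theta_meta = 2.0                      # stands in for the meta-trained parameters
support, query = [4.0, 4.2], [4.1]    # a made-up new task
loss_before = evaluate(theta_meta, query)
loss_after = evaluate(adapt(theta_meta, support), query)
```

The key point is the split of roles: only the support set drives parameter updates, and the query set is used purely for measurement.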

3. Experiments

The datasets are Cora, Citeseer, and Reddit, split as follows:

Here $|C_1|$ is the number of classes used for meta-training and $|C_2|$ is the number of classes used for meta-testing.

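The authors found that when labeled samples are very scarce, model performance is highly sensitive to which nodes are selected, so each model is run 50 times and the average performance is reported. The results are as follows: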

