FCN (Fully Convolutional Networks): The Pioneering Work in Image Segmentation

Paper: https://arxiv.org/abs/1411.4038

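Following the success of CNNs in image recognition, classic classification networks (AlexNet, VGG, GoogLeNet, ResNet) were gradually applied to finer-grained vision tasks, and many researchers explored how to adapt them to dense prediction problems such as semantic segmentation. Before more efficient segmentation networks appeared, models for dense prediction typically shared the following traits:

(1) Small models. Early architectures were limited by data volume and high-performance compute, so designs generally avoided large models.

(2) Patchwise training. Patchwise training was common practice at the time. It is comparatively inefficient for fully convolutional networks, but it can sidestep class imbalance and mitigate the spatial correlation of dense patches.

(3) Input shifting and output interlacing. This can be viewed as an input/output transformation trick, widely used in architectures such as OverFeat.

(4) Post-processing. When the raw network output is of low quality, post-processing the output is a common remedy. Typical methods include superpixel projection, random field regularization, and image filtering.

In short, early approaches to dense prediction tasks such as object detection, keypoint prediction, and semantic segmentation had two shortcomings. First, they were not end-to-end, which hurt overall efficiency. Second, they did not deliver what dense prediction really requires: pixels-to-pixels prediction.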

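Fully Convolutional Networks (FCN) address both shortcomings. FCN was proposed in the 2015 paper Fully Convolutional Networks for Semantic Segmentation. Its main idea is to replace the fully connected layers of a classification network with convolutional layers, so that the semantic label output becomes a semantic heatmap, and then to combine this with upsampling to obtain pixels-to-pixels dense prediction. In the figure below, the upper diagram is a typical classification pipeline: five convolutional layers followed by three fully connected layers, ending in a vector of class probabilities. The lower diagram is the FCN pipeline: the last three fully connected layers are converted to convolutional layers, so the output becomes a heatmap of class predictions, which provides the basis for the final pixel-level dense prediction.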

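So the key to dense prediction in FCN is converting fully connected layers into convolutional layers. How is that done? Consider the properties of the two layer types. The biggest difference is that a convolutional layer only operates on a local region of its input at each position, but both layers compute dot products, so their functional forms are similar. Given the data vector x_ij at location (i, j) in some layer and the data vector y_ij at the corresponding location in the next layer, we have:

$$y_{ij} = f_{ks}\left(\{x_{si+\delta i,\; sj+\delta j}\}_{0 \le \delta i,\, \delta j < k}\right)$$

where k is the kernel size (or the length of the weight vector), s is the stride, and f_ks is the mapping from the current layer to the next. Because f_ks can describe either a convolutional layer or a fully connected layer, the conversion between the two has a sound theoretical basis.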

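FCN applies the fully-connected-to-convolution conversion to AlexNet, VGG, and GoogLeNet, and experiments show that VGG16 works best as the backbone. The complete FCN structure is shown in the figure below. At the far left of the first row is the original input image, 32×32 in the figure; conv denotes a convolutional layer and pool a pooling layer. conv6-7 are the last convolutional layers, and the dense prediction heatmap at that point is 1/32 the size of the input. To obtain pixels-to-pixels prediction the heatmap must be upsampled, which FCN does with bilinear interpolation (in the paper, deconvolution layers initialized to bilinear upsampling). Restoring the original size requires 32× upsampling, so the network in the first row is called FCN-32s.

A single 32× upsampling step yields a rather coarse output, so FCN adds two refined variants, FCN-16s and FCN-8s. The second row of the figure is FCN-16s: the conv7 heatmap (1×1) is upsampled 2×, fused with pool4 (2×2), and the fused result is upsampled 16× to give the final prediction. Similarly, FCN-8s fuses pool3 (4×4), the 2×-upsampled pool4 (4×4), and the 4×-upsampled conv7 (4×4), then upsamples 8× to produce the segmentation map.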
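As a quick check on the stride bookkeeping described above, here is a minimal sketch in plain Python that traces the spatial sizes through the FCN-8s skip fusion. The function name `fcn_skip_sizes` is made up for this illustration, and the sketch assumes both input sides are divisible by 32 so that every pooling stage halves the size exactly.

```python
def fcn_skip_sizes(height, width):
    """Trace spatial sizes through FCN-8s skip fusion (illustrative only)."""
    if height % 32 or width % 32:
        raise ValueError("this sketch assumes both sides are divisible by 32")

    def scale(size, factor):
        return (size[0] * factor, size[1] * factor)

    pool3 = (height // 8, width // 8)    # output stride 8
    pool4 = (height // 16, width // 16)  # output stride 16
    conv7 = (height // 32, width // 32)  # output stride 32

    fuse_16s = scale(conv7, 2)    # 2x upsample conv7, then add pool4 scores
    fuse_8s = scale(fuse_16s, 2)  # 2x upsample again, then add pool3 scores
    output = scale(fuse_8s, 8)    # final 8x upsample back to the input size

    return {
        "pool3": pool3, "pool4": pool4, "conv7": conv7,
        "fuse_16s": fuse_16s, "fuse_8s": fuse_8s, "output": output,
    }


sizes = fcn_skip_sizes(224, 224)
print(sizes)
```

The fused maps line up with pool4 and pool3 respectively, which is why the element-wise additions in FCN-16s and FCN-8s are well defined.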

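So the progression from FCN-32s to FCN-8s is a move from coarse to fine segmentation. By fusing shallow image features with deep convolutional heatmaps, FCN reached the state of the art (SOTA) in semantic segmentation at the time. The figure below shows FCN-32s, FCN-16s, and FCN-8s on the same image; compared with the ground truth, segmentation accuracy improves steadily across the three models.

The code below gives a simplified PyTorch implementation of FCN-8s to help readers understand FCN. It uses VGG16 pretrained weights for the convolutional downsampling path and builds four feature extraction modules, one convolution block, and three separate convolutional layers. In the forward pass, conv7, pool4, and pool3 are fused, followed by a final 8× bilinear upsampling.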

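# Import the PyTorch modules used below
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

### FCN-8s model
class FCN8(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # VGG16 convolutional layers with ImageNet-pretrained weights
        # (older torchvision releases spell this `pretrained=True`)
        weights = models.VGG16_Weights.IMAGENET1K_V1
        feats = list(models.vgg16(weights=weights).features.children())
        # Layers 0-9: conv1_x, conv2_x and their pooling layers (1/4 resolution)
        self.feat1 = nn.Sequential(*feats[0:10])
        # Layers 10-16: conv3_x + pool3 (256 channels, 1/8 resolution)
        self.feat2 = nn.Sequential(*feats[10:17])
        # Layers 17-23: conv4_x + pool4 (512 channels, 1/16 resolution)
        self.feat3 = nn.Sequential(*feats[17:24])
        # Layers 24-30: conv5_x + pool5 (512 channels, 1/32 resolution)
        self.feat4 = nn.Sequential(*feats[24:31])
        # Freeze the pretrained convolution weights. requires_grad has to be
        # set on the parameters; setting it on the module has no effect.
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                for p in m.parameters():
                    p.requires_grad = False
        # Convolution block replacing fc6/fc7. padding=3 keeps the 1/32 map
        # size so that small inputs also work (the original had no padding).
        self.conv_blocks = nn.Sequential(
            nn.Conv2d(512, 4096, 7, padding=3),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Conv2d(4096, 4096, 1),
            nn.ReLU(inplace=True),
            nn.Dropout(),
        )
        # 1x1 convolutions producing per-class score maps
        self.conv1 = nn.Conv2d(256, num_classes, 1)   # scores from pool3
        self.conv2 = nn.Conv2d(512, num_classes, 1)   # scores from pool4
        self.conv3 = nn.Conv2d(4096, num_classes, 1)  # scores from conv7

    ### Forward pass
    def forward(self, x):
        feat1 = self.feat1(x)
        feat2 = self.feat2(feat1)
        feat3 = self.feat3(feat2)
        feat4 = self.feat4(feat3)
        conv_blocks = self.conv_blocks(feat4)

        conv1 = self.conv1(feat2)
        conv2 = self.conv2(feat3)
        conv3 = self.conv3(conv_blocks)
        # Upsample the conv7 scores (not the 4096-channel features) to pool4 size
        outputs = F.interpolate(conv3, size=conv2.shape[2:], mode='bilinear', align_corners=False)
        # First fusion
        outputs = outputs + conv2
        outputs = F.interpolate(outputs, size=conv1.shape[2:], mode='bilinear', align_corners=False)
        # Second fusion
        outputs = outputs + conv1
        # Final 8x upsampling back to the input resolution
        return F.interpolate(outputs, size=x.shape[2:], mode='bilinear', align_corners=False)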

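FCN is the pioneering deep learning network for semantic segmentation. It was the first to use a fully convolutional design for this task, achieving pixels-to-pixels, end-to-end segmentation on top of classic classification networks. FCN already contains the seeds of the U-shaped encoder-decoder architecture and laid a solid foundation for later network designs.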

SUNet: Swin Transformer with UNet for Image Denoising

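This ISCAS 2022 paper is the first application of Swin Transformer to image denoising, though its results still leave considerable room for improvement. That said, since Swin Transformer was proposed in 2021 it has dominated much of computer vision. As a general-purpose architecture it can be applied across CV tasks, and its influence is visible on Papers with Code (as of 2022-08-28):

1. Object detection:

2. Image super-resolution:

3. Instance segmentation:

4. 3D medical image segmentation:

Today, let's look at how Swin Transformer performs on image denoising:

Personally, I think Swin Transformer still has a lot of headroom for denoising. This paper's results are not especially strong, so it is worth experimenting to see whether better methods can improve them.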

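Main contributions of the paper:

1. Combines the UNet architecture with Swin Transformer

2. Proposes a dual up-sample block

3. The first to apply Swin Transformer + UNet to image denoising

4. Achieves good results on two common datasets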

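Network architecture:

The network has three parts: 1) shallow feature extraction; 2) UNet feature extraction; and 3) a reconstruction module.

1. Shallow feature extraction

A 3×3 convolution extracts features, with 96 output channels.

2. UNet feature extraction

A UNet architecture built from Swin Transformer Blocks, each containing 8 Swin Transformer layers, which take the place of convolutions.
Swin Transformer Block (STB) and Swin Transformer Layer (STL):

STB: contains 8 STLs

For this part, the Swin Transformer paper gives a clear explanation. Note that the input and output of an STB have exactly the same size, so a separate downsampling step is needed.

Downsampling: patch merging

Patch merging: the source code shows that it is simply a downsampling step, and it can be viewed as a kind of weighted pooling. It halves the spatial resolution and doubles the number of feature channels.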

class PatchMerging(nn.Module):
    def __init__(self, input_resolution, dim, norm_layer=nn.LayerNorm):
        super().__init__()
        self.input_resolution = input_resolution
        self.dim = dim
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)
        self.norm = norm_layer(4 * dim)

    def forward(self, x):
        """
        x: B, H*W, C
        """
        H, W = self.input_resolution
        B, L, C = x.shape
        assert L == H * W, "input feature has wrong size"
        assert H % 2 == 0 and W % 2 == 0, f"x size ({H}*{W}) are not even."

        x = x.view(B, H, W, C)

        x0 = x[:, 0::2, 0::2, :]  # B H/2 W/2 C
        x1 = x[:, 1::2, 0::2, :]  # B H/2 W/2 C
        x2 = x[:, 0::2, 1::2, :]  # B H/2 W/2 C
        x3 = x[:, 1::2, 1::2, :]  # B H/2 W/2 C
        x = torch.cat([x0, x1, x2, x3], -1)  # B H/2 W/2 4*C
        x = x.view(B, -1, 4 * C)  # B H/2*W/2 4*C

        x = self.norm(x)
        x = self.reduction(x)

        return x
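To make the regrouping inside `PatchMerging.forward` easier to see, here is a minimal sketch in plain Python that performs only the 2×2 gather-and-concatenate step on nested lists. The function name `merge_patches` is made up for this illustration; the LayerNorm and the linear 4C→2C reduction from the class above are deliberately left out.

```python
def merge_patches(grid):
    """Concatenate each 2x2 neighborhood of an H x W grid of channel lists.

    grid[i][j] is a list of C channel values. The result is an
    (H/2) x (W/2) grid whose cells hold 4*C values, in the same order as
    x0, x1, x2, x3 in PatchMerging.forward.
    """
    height, width = len(grid), len(grid[0])
    if height % 2 or width % 2:
        raise ValueError("grid sides must be even")
    merged = []
    for i in range(0, height, 2):
        row = []
        for j in range(0, width, 2):
            row.append(
                grid[i][j]            # x0: even row, even column
                + grid[i + 1][j]      # x1: odd row, even column
                + grid[i][j + 1]      # x2: even row, odd column
                + grid[i + 1][j + 1]  # x3: odd row, odd column
            )
        merged.append(row)
    return merged


# A 2x2 grid with one channel per position collapses to a single 4-channel cell.
print(merge_patches([[[1], [2]], [[3], [4]]]))
```

In the real module the concatenated 4C vector is then normalized and projected to 2C, which is where the "double the channels" effect comes from.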

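Upsampling: dual up-sample

The authors propose a dual up-sample module.

The module combines two existing upsampling methods (bilinear and PixelShuffle) to prevent checkerboard artifacts, which mainly arise in transposed convolution (see "Deconvolution and Checkerboard Artifacts", https://distill.pub/2016/deconv-checkerboard/).

Up-sample module

After the two upsampling branches, the results are concatenated along the channel dimension, and a convolution layer reduces the channel count to C/2.

Experiments:

As shown in the figure above.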

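Real-ESRGAN Super-Resolution Network

Paper: Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

Code: https://github.com/xinntao/Real-ESRGAN

Real-ESRGAN aims to develop practical image/video restoration algorithms.
It builds on ESRGAN and trains on purely synthetic data so that the model can be applied to real-world image restoration (hence the name Real-ESRGAN).

  1. Goal: handle image degradation in real-world scenarios.
  2. Dataset construction: blur kernels, noise, downscaling, and compression, applied in random order.
  3. Super-resolution backbone: the ESRGAN generator plus a U-Net discriminator.
  4. Loss functions: L1 loss, perceptual loss, and adversarial loss.
  5. Main baselines: RealSR, ESRGAN, BSRGAN, DAN, CDC.

Contributions

  1. A new way of constructing the dataset: a high-order process that increases the complexity of the degraded images.
  2. A sinc filter in the dataset construction to model ringing and overshoot artifacts.
  3. The VGG-style discriminator of the original ESRGAN is replaced with a U-Net discriminator to strengthen adversarial learning on image details.
  4. Spectral normalization to stabilize training, which the complex dataset and the U-Net discriminator would otherwise make unstable.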

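Dataset construction

Before describing how the dataset is built, the authors discuss in detail what degrades images, for example old phones, sensor noise, camera blur, image editing, transmission over the Internet, JPEG compression, and other noise. In the paper's words:

For example, when we take a photo with our cellphones, the photos may have several degradations, such as camera blur, sensor noise, sharpening artifacts, and JPEG compression. We then do some editing and upload to a social media APP, which introduces further compression and unpredictable noises.

To address this, the authors propose a high-order degradation model. We first describe the first-order degradation model, after which the high-order model is easy to understand.

First-order

$$x = \mathcal{D}(y) = \left[(y \circledast k)\downarrow_r +\, n\right]_{\mathrm{JPEG}}$$

The first-order degradation model is just the conventional degradation model: the operations in the formula above are applied in order.

x is the degraded image, D is the degradation function, and y is the original image;
k is the blur kernel, r is the downscaling factor, n is the added noise, and JPEG denotes compression.

Each degradation step has several options to choose from, as shown in the figure below:

For the blur kernel k, the method uses isotropic and anisotropic Gaussian blur kernels. The sinc filter is discussed later.

For the downscaling operation r, common choices are bicubic interpolation, bilinear interpolation, and area interpolation. Nearest-neighbor interpolation is excluded because it introduces alignment issues. When downscaling, the method picks one of the three interpolation modes at random.

For the noise n, the method adds both Gaussian noise and Poisson-distributed noise. Depending on the number of channels, the noise is added either as color noise or as gray noise.

For JPEG compression, the method chooses a quality factor in the range [0, 100] and compresses the image, where 0 gives the worst quality and 100 the best.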

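  • High-order

Because the first-order model uses a relatively monotonous degradation process, it struggles to imitate real-world low-resolution, blurry images. The high-order model uses a more complex degradation process to simulate real-world degradation more closely and so achieve better learning. The training results for a dataset built with the first-order model are as follows:

The high-order degradation model is:

$$x = \mathcal{D}^{n}(y) = (\mathcal{D}_n \circ \cdots \circ \mathcal{D}_2 \circ \mathcal{D}_1)(y)$$

In other words, the first-order model is applied repeatedly: each D_i is one complete first-order degradation. The authors found experimentally that applying the first-order model twice gives the best training data, so the high-order pipeline consists of two first-order stages in sequence.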
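The repeated application described above can be sketched on a toy 1-D signal in plain Python. This is only an illustration of the composition D₂∘D₁ with randomized parameters, not the paper's pipeline: the moving-average blur, the parameter ranges, and in particular `quantize` (a crude stand-in for JPEG compression) are all assumptions made for the example.

```python
import random


def blur(signal):
    """3-tap moving average with clamped edges (stand-in for a blur kernel k)."""
    n = len(signal)
    return [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]


def downsample(signal, r):
    """Keep every r-th sample (stand-in for the resize step)."""
    return signal[::r]


def add_noise(signal, sigma, rng):
    """Additive Gaussian noise n."""
    return [v + rng.gauss(0.0, sigma) for v in signal]


def quantize(signal, levels):
    """Clamp to [0, 1] and quantize; an assumed stand-in for JPEG, not real JPEG."""
    return [
        round(min(max(v, 0.0), 1.0) * (levels - 1)) / (levels - 1)
        for v in signal
    ]


def first_order(signal, rng, r=2):
    """One degradation D: blur -> downsample -> noise -> compression stand-in."""
    sigma = rng.uniform(0.0, 0.05)     # assumed noise range
    levels = rng.choice([8, 16, 32])   # assumed "quality" settings
    return quantize(add_noise(downsample(blur(signal), r), sigma, rng), levels)


def high_order(signal, order=2, seed=0):
    """Apply the first-order degradation `order` times with fresh random parameters."""
    rng = random.Random(seed)
    for _ in range(order):
        signal = first_order(signal, rng)
    return signal


clean = [i / 15 for i in range(16)]
print(high_order(clean))
```

Each stage draws its own parameters, which is what makes the second-order output more varied than a single pass with fixed settings.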

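  • sinc filter

To deal with ringing and overshoot artifacts in super-resolved images (both are common in image processing and are not covered in depth here), the authors add a sinc filter to the degradation model. Here is what ringing and overshoot artifacts look like:

The upper image shows real ringing and overshoot artifacts; the lower image shows artifacts synthesized by applying sinc filters with different factors. For how the sinc filter is constructed, see the paper.

The sinc filter is applied in two places. First, in the blur step k of every stage, after the isotropic and anisotropic Gaussian blur. Second, at the JPEG compression step of the last stage, where the order of JPEG compression and the sinc filter is random.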

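Network architecture

  • Generator

The generator is essentially the ESRGAN generator, extended to support ×2 and ×1 enhancement. For ×4 super-resolution, the network runs exactly as the ESRGAN generator. For ×2 and ×1, the network first applies pixel-unshuffle (the inverse of pixel-shuffle, which enlarges the spatial size by reducing the number of channels), lowering the resolution while expanding the channel count, and then feeds the result into the network for reconstruction. For example, to enlarge and sharpen an image by ×2, the image is first reduced 2× with pixel-unshuffle and then enlarged 4× by the network.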
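The pixel-unshuffle step mentioned above is just a rearrangement of pixels, which a minimal plain-Python sketch on a single-channel image can show. The function names and the channel ordering (offset `dy * r + dx`) are choices made for this illustration; they are intended to mirror the usual convention, but treat that as an assumption rather than a statement about any particular library.

```python
def pixel_unshuffle(image, r):
    """Rearrange an H x W image into r*r channels of size (H/r) x (W/r)."""
    height, width = len(image), len(image[0])
    if height % r or width % r:
        raise ValueError("image sides must be divisible by r")
    return [
        [[image[y * r + dy][x * r + dx] for x in range(width // r)]
         for y in range(height // r)]
        for dy in range(r)
        for dx in range(r)
    ]


def pixel_shuffle(channels, r):
    """Inverse of pixel_unshuffle: r*r channels back to one (H*r) x (W*r) image."""
    height, width = len(channels[0]), len(channels[0][0])
    image = [[0] * (width * r) for _ in range(height * r)]
    for c, channel in enumerate(channels):
        dy, dx = divmod(c, r)
        for y in range(height):
            for x in range(width):
                image[y * r + dy][x * r + dx] = channel[y][x]
    return image


image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
channels = pixel_unshuffle(image, 2)
print(channels[0])                          # top-left pixel of every 2x2 cell
print(pixel_shuffle(channels, 2) == image)  # round trip is lossless
```

No information is lost: the spatial size shrinks by r in each dimension while the channel count grows by r², which is why the ×4 generator can be reused for ×2 and ×1 outputs.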

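Discriminator

Because the dataset is constructed in a complex way, a more capable discriminator is needed to judge the generated images. The earlier ESRGAN discriminator mostly judges realness at the level of the whole image, whereas a U-Net discriminator judges each generated pixel, which keeps the overall image realistic while paying attention to fine detail.

  • Spectral normalization

Adding this operation alleviates the training instability caused by the complex dataset and the complex network.

Training

Training has two stages:

  1. Train a PSNR-oriented network with L1 loss; the resulting model is called Real-ESRNet.
  2. Initialize from the Real-ESRNet parameters and train the final network, Real-ESRGAN, with L1 loss, perceptual loss, and GAN loss.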

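ESRGAN Image Super-Resolution

Paper: https://arxiv.org/abs/1809.00219

GitHub: https://github.com/xinntao/ESRGAN

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks was published in the ECCV 2018 Workshops. It improves on SRGAN by changing the network structure and the form of the discriminator's judgment, and by swapping the pretrained network used to compute the perceptual loss.

The Super-Resolution Generative Adversarial Network (SRGAN) is a seminal work capable of generating realistic textures in single-image super-resolution. It was published at CVPR 2017.

Paper: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

However, the hallucinated details are often accompanied by unpleasant artifacts. To improve visual quality further, the authors study three key components of SRGAN: 1. network architecture; 2. adversarial loss; 3. perceptual loss. They improve each to obtain ESRGAN. Specifically, the paper introduces the Residual-in-Residual Dense Block (RRDB), a building unit without BN (Batch Norm) layers. The authors also borrow the idea of the relativistic GAN, letting the discriminator predict relative realness rather than an absolute "real or fake" value. Finally, the perceptual loss is improved by using features before activation, which gives stronger supervision for brightness consistency and texture recovery. With these improvements, ESRGAN achieves better visual quality with more realistic and natural textures.

ESRGAN outperforms SRGAN in both texture and detail.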

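SRGAN: Background and Contributions

With various architecture designs and training strategies, existing super-resolution networks have improved greatly, especially in PSNR. But PSNR-oriented models tend to produce over-smoothed results that lack the necessary high-frequency information, and PSNR fundamentally disagrees with the subjective evaluation of human observers.

Several perceptual-driven methods have been proposed to improve the visual quality of super-resolution results. For example, perceptual loss optimizes the model in feature space rather than pixel space; generative adversarial networks encourage the network to produce results closer to natural images; and semantic image priors further improve recovered texture detail.

By combining these ideas, SRGAN greatly improved the visual quality of super-resolution results. Still, a large gap remains between SRGAN outputs and ground-truth (GT) images.

ESRGAN's Improvements

The paper improves three points: 1. the basic unit changes from a plain residual block to the Residual-in-Residual Dense Block (RRDB); 2. the GAN is changed to a Relativistic average GAN (RaGAN); 3. the perceptual loss uses VGG features before activation, which gives sharper edges and more visually pleasing results.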

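Network Architecture and Ideas

Generator

First, the authors adopt SRResNet as the overall architecture. Its basic structure is shown below:

SRResNet basic structure

To improve the quality of SRGAN's reconstructions, the authors make two changes to the generator G: 1. remove all BN layers; 2. replace the original block with the Residual-in-Residual Dense Block (RRDB), which combines a multi-level residual network with dense connections.

As shown below:

RRDB

Ideas:

Effect of BN layers

For PSNR-oriented tasks, including super-resolution and deblurring, removing BN layers has been shown to improve performance and reduce computational complexity. During training, a BN layer normalizes features with the mean and variance of the current batch; at test time it uses the mean and variance estimated over the whole training set. When the statistics of the training and test sets differ a lot, BN layers tend to introduce unpleasant artifacts and limit generalization. The authors found that BN layers are more likely to produce artifacts when the network is deep and trained in a GAN framework. These artifacts appear occasionally across iterations and settings, which violates the need for stable training. So, for stable training and consistent performance, the authors remove the BN layers. Removing them also improves generalization and reduces computation and memory usage.

Tricks:

Beyond the changes above, the authors use some techniques to train the deep network: 1. residual scaling, i.e. multiplying the residual by a constant between 0 and 1 to prevent instability; 2. smaller initialization, since they found the residual architecture easier to train when the variance of the initial parameters is smaller.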

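Discriminator

Besides the improved generator, the authors also improve the discriminator based on the Relativistic GAN. The discriminator D is a VGG-style network. In SRGAN, D estimates the probability that its input image is real and natural, whereas a relativistic discriminator tries to estimate the probability that a real image is relatively more realistic than a fake one.

Specifically, the standard discriminator is replaced with the Relativistic average Discriminator (RaD), where C(x) is the raw discriminator output and σ is the sigmoid function:

$$D_{Ra}(x_r, x_f) = \sigma\big(C(x_r) - \mathbb{E}_{x_f}[C(x_f)]\big)$$

The discriminator loss is then defined as:

$$L_D^{Ra} = -\mathbb{E}_{x_r}\big[\log D_{Ra}(x_r, x_f)\big] - \mathbb{E}_{x_f}\big[\log\big(1 - D_{Ra}(x_f, x_r)\big)\big]$$

and the adversarial loss for the generator takes the symmetric form:

$$L_G^{Ra} = -\mathbb{E}_{x_r}\big[\log\big(1 - D_{Ra}(x_r, x_f)\big)\big] - \mathbb{E}_{x_f}\big[\log D_{Ra}(x_f, x_r)\big]$$

The expectation is taken by averaging over all data in the mini-batch, and x_f is the image obtained by passing the original low-resolution image through the generator.

Note that the adversarial loss contains both x_r and x_f, so the generator benefits from the gradients of both generated and real data during adversarial training. This helps the network learn sharper edges and more detailed textures.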
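The relativistic average losses can be sketched in a few lines of plain Python. This is a minimal illustration of the formulation from the ESRGAN paper on lists of raw discriminator outputs; the function name `ragan_losses` and the use of scalars instead of tensors are choices made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean(values):
    return sum(values) / len(values)


def ragan_losses(c_real, c_fake):
    """Relativistic average GAN losses for one mini-batch.

    c_real, c_fake: raw (pre-sigmoid) discriminator outputs C(x_r), C(x_f).
    Returns (discriminator_loss, generator_adversarial_loss).
    """
    mean_real, mean_fake = mean(c_real), mean(c_fake)
    # D_Ra(x_r, x_f): how much more realistic real images look than the average fake
    d_real = [sigmoid(c - mean_fake) for c in c_real]
    # D_Ra(x_f, x_r): how much more realistic fakes look than the average real
    d_fake = [sigmoid(c - mean_real) for c in c_fake]

    loss_d = (-mean([math.log(p) for p in d_real])
              - mean([math.log(1.0 - p) for p in d_fake]))
    loss_g = (-mean([math.log(1.0 - p) for p in d_real])
              - mean([math.log(p) for p in d_fake]))
    return loss_d, loss_g


# A discriminator that separates real from fake well has a low loss,
# while the generator's adversarial loss is high.
print(ragan_losses([2.0, 3.0], [-2.0, -3.0]))
```

Both terms of the generator loss depend on real and fake scores, which is the property the paragraph above points to.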

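Perceptual Loss

The paper also proposes a more effective perceptual loss that uses features before activation (from a pretrained VGG network; the paper uses the 19-layer VGG).

Perceptual loss is conventionally defined on the activation layers of a pretrained deep network, minimizing the distance between two activated features. In contrast, this paper uses features before activation, which overcomes two drawbacks. First, activated features are very sparse, especially in very deep networks; sparse activations give weak supervision and lead to poor performance. Second, using activated features causes the reconstructed brightness to be inconsistent with the GT.

Comparison of features before and after activation: (a) brightness; (b) detail

The authors also explore the perceptual loss itself. Unlike the usual perceptual loss built on a VGG network trained for image classification, they propose one better suited to super-resolution, based on a VGG16 network fine-tuned for material recognition (MINCNet), which focuses on textures rather than objects. Although the gain is marginal, the authors still believe that exploring texture-focused perceptual losses is critical for super-resolution.

Loss Function

With the network modules defined and built as above, we define the loss function and training can begin.

For the generator G, the total loss is:

$$L_G = L_{percep} + \lambda L_G^{Ra} + \eta L_1$$

Code walkthrough:

https://zhuanlan.zhihu.com/p/54473407?utm_id=0

3. The network for extracting the perceptual loss (Perceptual Network)

The code uses a VGG16 network for material recognition (MINCNet) to extract perceptual features, defined as follows: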

class MINCNet(nn.Module):
    def __init__(self):
        super(MINCNet, self).__init__()
        self.ReLU = nn.ReLU(True)
        self.conv11 = nn.Conv2d(3, 64, 3, 1, 1)
        self.conv12 = nn.Conv2d(64, 64, 3, 1, 1)
        self.maxpool1 = nn.MaxPool2d(2, stride=2, padding=0, ceil_mode=True)
        self.conv21 = nn.Conv2d(64, 128, 3, 1, 1)
        self.conv22 = nn.Conv2d(128, 128, 3, 1, 1)
        self.maxpool2 = nn.MaxPool2d(2, stride=2, padding=0, ceil_mode=True)
        self.conv31 = nn.Conv2d(128, 256, 3, 1, 1)
        self.conv32 = nn.Conv2d(256, 256, 3, 1, 1)
        self.conv33 = nn.Conv2d(256, 256, 3, 1, 1)
        self.maxpool3 = nn.MaxPool2d(2, stride=2, padding=0, ceil_mode=True)
        self.conv41 = nn.Conv2d(256, 512, 3, 1, 1)
        self.conv42 = nn.Conv2d(512, 512, 3, 1, 1)
        self.conv43 = nn.Conv2d(512, 512, 3, 1, 1)
        self.maxpool4 = nn.MaxPool2d(2, stride=2, padding=0, ceil_mode=True)
        self.conv51 = nn.Conv2d(512, 512, 3, 1, 1)
        self.conv52 = nn.Conv2d(512, 512, 3, 1, 1)
        self.conv53 = nn.Conv2d(512, 512, 3, 1, 1)

    def forward(self, x):
        out = self.ReLU(self.conv11(x))
        out = self.ReLU(self.conv12(out))
        out = self.maxpool1(out)
        out = self.ReLU(self.conv21(out))
        out = self.ReLU(self.conv22(out))
        out = self.maxpool2(out)
        out = self.ReLU(self.conv31(out))
        out = self.ReLU(self.conv32(out))
        out = self.ReLU(self.conv33(out))
        out = self.maxpool3(out)
        out = self.ReLU(self.conv41(out))
        out = self.ReLU(self.conv42(out))
        out = self.ReLU(self.conv43(out))
        out = self.maxpool4(out)
        out = self.ReLU(self.conv51(out))
        out = self.ReLU(self.conv52(out))
        out = self.conv53(out)
        return out

After loading the pretrained parameters, it can be used for feature extraction:

class MINCFeatureExtractor(nn.Module):
    def __init__(self, feature_layer=34, use_bn=False, use_input_norm=True, \
                device=torch.device('cpu')):
        super(MINCFeatureExtractor, self).__init__()

        self.features = MINCNet()
        self.features.load_state_dict(
            torch.load('../experiments/pretrained_models/VGG16minc_53.pth'), strict=True)
        self.features.eval()
        # No need to BP to variable
        for k, v in self.features.named_parameters():
            v.requires_grad = False

    def forward(self, x):
        output = self.features(x)
        return output

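Network Interpolation

To balance perceptual quality against metrics such as PSNR, the authors propose a flexible and effective method: network interpolation. They first train a PSNR-oriented network G_PSNR and then fine-tune it with the GAN-based objective to obtain G_GAN.

The corresponding parameters of the two networks are then interpolated to give the network G_INTERP:

$$\theta_G^{INTERP} = (1 - \alpha)\,\theta_G^{PSNR} + \alpha\,\theta_G^{GAN}, \quad \alpha \in [0, 1]$$

The value of α then controls the trade-off.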
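Parameter-space interpolation is simple enough to sketch directly. The minimal plain-Python example below blends two parameter dictionaries with weight α, where α = 0 gives the PSNR-oriented network and α = 1 the GAN-based one. The dictionary-of-lists layout and the name `interpolate_params` are assumptions for this illustration; with real models the same loop would run over the two state dicts.

```python
def interpolate_params(theta_psnr, theta_gan, alpha):
    """Blend two parameter sets: (1 - alpha) * theta_psnr + alpha * theta_gan."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if theta_psnr.keys() != theta_gan.keys():
        raise ValueError("both networks must have the same parameter names")
    return {
        name: [(1.0 - alpha) * p + alpha * g
               for p, g in zip(theta_psnr[name], theta_gan[name])]
        for name in theta_psnr
    }


# Toy parameters; real networks hold tensors instead of short lists.
theta_psnr = {"conv.weight": [0.0, 2.0]}
theta_gan = {"conv.weight": [1.0, 4.0]}
print(interpolate_params(theta_psnr, theta_gan, 0.25))
```

Because only the weights are blended, one pair of trained networks yields a whole family of models along the PSNR/perceptual trade-off without any retraining.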

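Training Details

Scaling factor: 4; mini-batch size: 16

LR images are obtained by downsampling HR images with MATLAB's bicubic function.

HR patch size: 128×128 (experiments show that larger patches help when training a deep network, because a larger receptive field helps the model capture more semantic information)

Training procedure:

1. Train a PSNR-oriented model (L1 loss)

Initial learning rate: 2×10⁻⁴

The learning rate is halved every 200,000 mini-batch updates

2. Use the model from step 1 to initialize the generator

λ = 5×10⁻³, η = 0.01, β = 0.2 (residual scaling factor)

Initial learning rate: 1×10⁻⁴, halved after 50k, 100k, 200k, and 300k iterations.

Pretraining with a pixel-wise loss helps the GAN-based model produce more visually pleasing results for two reasons: 1. it avoids undesired local optima for the generator; 2. after pretraining, the discriminator receives relatively good super-resolved images rather than outputs of a freshly initialized network, which lets it focus more on texture discrimination.

Optimizer: Adam (β1 = 0.9, β2 = 0.999); the generator and discriminator are updated alternately until convergence.

Generator settings: 1. 16 basic residual blocks; 2. 23 RRDB blocks

Datasets: DIV2K, Flickr2K, OST (datasets with rich textures help the model produce more natural results)

ESRGAN's PSNR is not high, but its results look better and its Perceptual Index is lower (lower is better). ESRGAN also won first place in the PIRM-SR Challenge (on the Perceptual Index).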

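From the experiments, the authors conclude:

1. Removing BN: does not degrade performance and saves computation and memory. Also, when the network gets deeper and more complex, models with BN layers are more likely to produce visually unpleasant artifacts.

2. Using features before activation: the resulting images have more accurate brightness, sharper edges, and richer detail.

3. RaGAN: produces sharper edges and richer detail.

4. RRDB: further improves the recovered textures (because deeper models have strong representational capacity for capturing semantic information) and can reduce noise.

Network Interpolation Experiments

To balance visual quality against metrics such as PSNR, the authors experimented with different values of the interpolation parameter α. The results are as follows:

Summary

ESRGAN improves on SRGAN by removing BN layers, switching the basic block to RRDB, changing what the GAN discriminator predicts, and building the perceptual loss from features before activation. Experiments show that each of these helps the visual quality of the output. The authors also use some tricks to improve performance, including residual scaling and smaller initialization. Finally, they use network interpolation to balance visual quality against metrics such as PSNR.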

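More Photorealistic than the "AI Designer" DALL·E 2: Google Brain's New Text-to-Image Model, Imagen

Paper: https://arxiv.org/abs/2205.11487

Demo: https://imagen.research.google/

Another big release in text-to-image models!

This time the star is Imagen from Google Brain, which pushes the photorealism and language understanding of text-to-image generation to a new level, beyond OpenAI's recent DALL·E 2.

Without further ado, let's enjoy this AI painter's work: "A brain riding a rocketship heading towards the moon."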


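How Imagen Works

The approach has three main parts, as shown in the figure below. First is the text encoder, for which the paper uses T5 directly. Next is the diffusion generation model, which, like GLIDE, uses classifier-free guidance. Finally, the generated small image is super-resolved into a large one. Each module is described below:

2.1. Text encoder

The paper compares BERT (base model, 110M parameters), CLIP (63M), and T5 (11B parameters), and finds that T5 works best. It also drops the usual step of fine-tuning the text encoder on <text, image> pairs. The reason is simple: the encoder is several orders of magnitude larger, so fine-tuning is not needed.

2.2. Diffusion generation

This part is close to GLIDE and can be compared directly with the eps-prediction formula in the GLIDE paper. The only difference is that the unconditional case does not use an empty text representation.

text condition diffusion model using classifier-free guidance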
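The classifier-free guidance rule referred to above combines a conditional and an unconditional noise prediction as ε̃ = w·ε(z, c) + (1 − w)·ε(z). A minimal plain-Python sketch on lists of floats follows; the function name `guided_eps` is made up for this illustration, and in a real sampler the two inputs would be tensors produced by the same U-Net with and without the text condition.

```python
def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance: w * eps(z, c) + (1 - w) * eps(z).

    w = 1 recovers the plain conditional prediction; w > 1 pushes the
    prediction further in the direction suggested by the condition.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [w * c + (1.0 - w) * u for c, u in zip(eps_cond, eps_uncond)]


print(guided_eps([1.0, -1.0], [0.5, 0.0], w=2.0))
```

Larger guidance weights usually improve text alignment at the cost of fidelity, which is the problem the thresholding sampler in Imagen is meant to mitigate.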

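Classifier-free guidance is by now a standard optimization for diffusion models. How text features are fused into the generation model can also be read directly from GLIDE. The model here is still a typical 64×64 U-Net, as shown in Figure 2 below. A small model is chosen mainly because the diffusion iteration is long and makes generation slow, so generating small images is the easiest way to speed things up, though it also means the model cannot directly generate large images with complex content and spatial relationships. The U-Net consists of an encoder on the left, a decoder on the right, and two convolution + activation layers at the bottom.

Encoder: the red box on the left contains 4 repeated units, each with two 3×3 convolutions, a nonlinear ReLU layer, and a 2×2 max pooling layer with stride 2. The number of feature channels doubles at each downsampling step.

Decoder: the blue box on the right also contains 4 repeated units. Each unit starts with a transposed convolution, which halves the number of feature channels and doubles the feature map size. The result is then concatenated with the feature map from the corresponding encoder step, and the concatenated feature map goes through two 3×3 convolutions. The last layer is a 1×1 convolution that maps the 64-channel feature map to the required number of output classes.

2.3. Diffusion super-resolution

Super-resolution directly improves efficiency but may introduce distortion in the final details. The paper notes that noise augmentation makes the model more robust to such distortion; see the paper for the details. This part uses Efficient U-Net, a U-Net variant whose advantages are better memory efficiency, faster inference, and faster training convergence.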

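Large Pretrained Language Model × Cascaded Diffusion Model

Imagen uses a general-purpose large language model pretrained on text-only corpora (such as T5), which turns out to be remarkably effective for text-to-image synthesis: increasing the size of the language model, rather than the size of the image diffusion model, greatly improves sample fidelity and image-text alignment.

The key findings of Imagen are:

  • Large frozen pretrained text encoders are very effective for text-to-image tasks;
  • Scaling the pretrained text encoder matters more than scaling the diffusion model;
  • A new thresholding diffusion sampler that allows very large classifier-free guidance weights;
  • A new Efficient U-Net architecture that is more compute-efficient, more memory-efficient, and converges faster;
  • Imagen achieves a state-of-the-art FID of 7.27 on COCO without ever training on COCO, and human raters find Imagen samples on par with the COCO data itself in image-text alignment.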
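The thresholding sampler in the list above refers to what the Imagen paper calls dynamic thresholding: at each sampling step, pick s as a high percentile of the absolute values in the predicted clean image, and if s > 1, clip the prediction to [−s, s] and divide by s. A minimal plain-Python sketch is below. The function name, the default percentile of 0.995, and the simple rank-based percentile are assumptions made for this illustration.

```python
def dynamic_threshold(x0_pred, percentile=0.995):
    """Clip a predicted clean sample to [-s, s] and rescale by s.

    s is a percentile of the absolute values, floored at 1.0 so that
    predictions already inside [-1, 1] are left unchanged.
    """
    magnitudes = sorted(abs(v) for v in x0_pred)
    index = min(len(magnitudes) - 1, int(percentile * len(magnitudes)))
    s = max(magnitudes[index], 1.0)
    return [min(max(v, -s), s) / s for v in x0_pred]


# With a large guidance weight the prediction can leave [-1, 1];
# thresholding pulls it back without saturating every value.
print(dynamic_threshold([0.0, 1.0, 2.0, 8.0], percentile=0.5))
```

Compared with clipping straight to [−1, 1], this keeps more of the relative contrast between pixels when many values overshoot.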

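2 Introducing a New Benchmark: DrawBench

To evaluate text-to-image models in more depth, Google Brain introduces DrawBench, a comprehensive and challenging benchmark for text-to-image models. Using DrawBench, they compare Imagen with VQ-GAN+CLIP, Latent Diffusion Models, DALL-E 2, and other methods, and find that human raters prefer Imagen over the other models in both sample quality and image-text alignment.

  • Side-by-side human evaluation;
  • Systematic tests of compositionality, cardinality, spatial relations, long-form text, rare words, and challenging prompts;
  • Thanks to its image-text alignment and image fidelity, users strongly prefer Imagen over other methods.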

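3 Opening Pandora's Box?

Text-to-image research such as Imagen faces a range of ethical challenges.

First, the downstream applications of text-to-image models are varied and may affect society in many ways. Imagen, like any text-to-image system, carries a risk of misuse, so society expects developers to release code and demos responsibly. For these reasons, Google decided not to release code or a public demo for now. In future work, Google will explore a framework for responsible externalization that minimizes the various potential risks.

Second, the data requirements of text-to-image models have led researchers to rely heavily on large, mostly uncurated, web-scraped datasets. While this approach has enabled rapid algorithmic progress in recent years, datasets of this kind often carry social stereotypes, oppressive viewpoints, and content derogatory to marginalized groups, i.e. "toxic" information.

To remove noise and undesirable content (such as pornographic imagery and "toxic" language), Google filtered a subset of the training data. Google also used the well-known LAION-400M dataset, which is known to contain a wide range of inappropriate content, including pornographic imagery, racist slurs, and harmful social stereotypes. Imagen relies on a text encoder trained on uncurated web-scale data and therefore inherits the social biases and limitations of large language models. This means Imagen may encode harmful stereotypes and other limitations, so Google decided not to release Imagen for public use without further safeguards.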

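OpenAI Releases the Text-to-Image Tool DALL·E 2

Paper: https://arxiv.org/abs/2204.06125

Official site: https://openai.com/dall-e-2/

Paper walkthrough: DALL·E 2 [paper deep dive]

DALL·E mini: a small open-source DALL·E library

An API is available at: https://www.craiyon.com/

What DALL·E 2 can do:

Official site

DALL-E 2 can generate images from user prompts that are clearly fantastical yet look entirely plausible. As a powerful model, DALL-E 2 is also known to be able to:

  • Generate images in a specific artistic style, as if painted by an artist of that style;
  • Generate multiple variations of an image while preserving its salient features, each looking natural;
  • Edit existing images seamlessly, leaving no trace.

1. Generate images from text (concept, attributes, style):

An astronaut riding a horse in a photorealistic style
A bowl of soup that is a portal to another dimension as digital art

2. DALL·E 2 can take an image and create different variations of it inspired by the original. (Given an input image, it generates images similar to the original.)

The capabilities DALL-E 2 has shown so far are stunning and have sparked much discussion among AI enthusiasts: how does such a powerful model actually work?

3. DALL·E 2 can make realistic edits to existing images from a natural language caption. It can add and remove elements while taking shadows, reflections, and textures into account.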


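1. How it works: simple and direct

Image source: https://arxiv.org/abs/2204.0612

For image generation, the working principle of DALL-E 2 does not look complicated when broken down:

  1. First, a text prompt is fed into a text encoder, which has been trained to map the prompt into a representation space.
  2. Next, a model called the prior maps the text encoding to a corresponding image encoding, which captures the semantic information of the prompt contained in the text encoding.
  3. Finally, an image decoder stochastically generates an image that visually expresses this semantic information.

2. The details: subtleties everywhere

These steps are easy to state, but each is hard in its own right. Let's walk through DALL-E 2's workflow and see how each step is made to work.

Our first step is to look at how DALL-E 2 learns to connect text with visual images.

Step 1: Linking text and visual images

Given the prompt "a teddy bear riding a skateboard in Times Square", DALL-E 2 generates the image below:

How does DALL-E 2 know what the textual concept "teddy bear" looks like in visual space?

The link between textual semantics and the corresponding visual images in DALL-E 2 is learned by another OpenAI model, CLIP (Contrastive Language-Image Pre-training).

CLIP is trained on hundreds of millions of images and their associated captions, and learns how strongly a given text snippet relates to an image.

In other words, CLIP does not try to predict the caption for a given image; it only learns how related any given text is to an image. CLIP's objective is contrastive rather than predictive.

The whole DALL-E 2 model depends on CLIP's ability to learn semantics from natural language, so let's look at how CLIP is trained to understand its inner workings.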

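CLIP Training

The basic principle of training CLIP is very simple:

  1. First, all images and their associated captions are passed through their respective encoders, mapping every object into an m-dimensional space.
  2. Then, the cosine similarity of every (image, text) pair is computed.
  3. The training objective is to maximize the cosine similarity of the N correctly matched image/caption pairs while minimizing the cosine similarity of the N² − N mismatched pairs.
CLIP training procedure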
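The three training steps above can be sketched as a symmetric contrastive loss over an N×N similarity matrix. The plain-Python example below is a minimal illustration, not CLIP's actual code: the function names are made up, the embeddings are toy 2-D vectors, and the temperature of 0.07 is an assumed fixed value (in CLIP the temperature is learned).

```python
import math


def normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i should select column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_loss(image_embeddings, text_embeddings, temperature=0.07):
    """Symmetric contrastive loss over N matched (image, text) pairs."""
    images = [normalize(v) for v in image_embeddings]
    texts = [normalize(v) for v in text_embeddings]
    # Cosine similarity of every image with every text, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_t = [list(column) for column in zip(*logits)]
    # Image-to-text and text-to-image directions are averaged
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)  # matched pairs give a much lower loss
```

The diagonal of the matrix holds the N correct pairs and the off-diagonal entries the N² − N incorrect ones, so minimizing this loss does both jobs from step 3 at once.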

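What CLIP Means for DALL-E 2

CLIP is practically the heart of DALL-E 2, because CLIP is what semantically links natural language snippets with visual concepts, which is crucial for generating images that match the text.

Step 2: Generating images from visual semantics

After training, the CLIP model is frozen, and DALL-E 2 moves on to the next task: learning to invert the image encoding mapping that CLIP has just learned. CLIP learns a representation space in which the relatedness of text encodings and visual encodings is easy to determine, and we need to learn to use this space to invert the image encoding mapping.

OpenAI uses a modified version of GLIDE, one of its earlier models, to perform image generation. The GLIDE model learns to invert the image encoding process in order to stochastically decode CLIP image embeddings.

The image "a corgi playing a flame-throwing trumpet" is passed through CLIP's image encoder, and GLIDE uses this encoding to generate a new image that keeps the salient features of the original. Image source: https://arxiv.org/abs/2204.06125

As the figure above shows, the goal is not to build an autoencoder that exactly reconstructs the image from its embedding, but to generate an image that preserves the salient features of the original given that embedding. To generate images, GLIDE uses a diffusion model.

What is a diffusion model?

Diffusion models are inspired by thermodynamics and have become increasingly popular in recent years. A diffusion model learns to generate data by reversing a gradual noising process. As shown in the figure below, the noising process is treated as a parameterized Markov chain that gradually adds noise to an image, corrupting it until it (asymptotically) becomes pure Gaussian noise. The diffusion model learns to walk back along this chain, removing the noise step by step to reverse the process.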
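The forward (noising) half of this process has a convenient closed form in the DDPM formulation: x_t = √ᾱ_t · x_0 + √(1 − ᾱ_t) · ε, where ᾱ_t is the running product of (1 − β_s). Below is a minimal plain-Python sketch. The linear β schedule from 1e-4 to 0.02 over 1000 steps follows the DDPM paper's setting, but the function names and the list-based "image" are assumptions made for this illustration.

```python
import math
import random


def alpha_bar(t, total_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) over the first t steps of a linear schedule."""
    product = 1.0
    for step in range(t):
        beta = beta_start + (beta_end - beta_start) * step / (total_steps - 1)
        product *= 1.0 - beta
    return product


def q_sample(x0, t, rng):
    """Sample x_t from q(x_t | x_0) in one shot instead of iterating t steps."""
    a_bar = alpha_bar(t)
    return [
        math.sqrt(a_bar) * value + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for value in x0
    ]


rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]
print(alpha_bar(10), alpha_bar(500), alpha_bar(1000))  # signal fraction shrinks toward 0
print(q_sample(x0, 500, rng))
```

By the last step almost none of the original signal remains, which is why sampling can start from pure Gaussian noise and run the learned reverse chain.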

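Diffusion model diagram. Image source: https://arxiv.org/pdf/2006.11239.pdf

If the diffusion model is "cut in half" after training, images can be generated by sampling random Gaussian noise and then denoising it into a realistic image. This may remind you of generating data with an autoencoder, and in fact diffusion models and autoencoders are related.

Training GLIDE

GLIDE was not the first diffusion model, but its important contribution is modifying the model so that it can generate text-conditional images.

GLIDE extends the core idea of diffusion models by augmenting the training process with additional text information, ultimately generating text-conditional images. Let's look at the GLIDE training procedure:

Below are some images generated with GLIDE. The authors note that GLIDE outperforms DALL-E (1) in both photorealism and caption similarity.

DALL-E 2 uses a modified GLIDE model that uses projected CLIP text embeddings in two ways. The first is to add them to GLIDE's existing timestep embedding. The second is to create four extra context tokens, which are concatenated to the output sequence of the GLIDE text encoder.

What GLIDE Means for DALL-E 2

GLIDE is also important to DALL-E 2, because it lets the authors port GLIDE's text-conditional photorealistic image generation to DALL-E 2 by conditioning on image encodings in the representation space instead. The modified GLIDE used in DALL-E 2 therefore learns to generate semantically consistent images conditioned on CLIP image encodings.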

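Step 3: Mapping from textual semantics to the corresponding visual semantics

At this point, how do we inject the text-conditional information from the prompt into the image generation process?

Recall that, besides an image encoder, CLIP also learns a text encoder. DALL-E 2 uses another model, which the authors call the prior, to map from the text encoding of an image caption to the image encoding of the corresponding image. The DALL-E 2 authors experimented with both autoregressive and diffusion models for the prior and found that they perform comparably. Since the diffusion model is more computationally efficient, it was chosen as the prior for DALL-E 2.

The prior mapping from a text encoding to the corresponding image encoding. Modified from: https://arxiv.org/abs/2204.06125

Training the Prior

The diffusion prior in DALL-E 2 operates on the following sequence, in order:

  1. The tokenized text;
  2. The CLIP text encoding of these tokens;
  3. An encoding of the diffusion timestep;
  4. The noised image passed through the CLIP image encoder;
  5. A final encoding whose Transformer output is used to predict the un-noised CLIP image encoding.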

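Step 4: Putting it all together

We now have all the "parts" of DALL-E 2. All that is left is to combine them to get what we want, an image that matches the text prompt:

  1. First, the CLIP text encoder maps the image description into the representation space;
  2. Then the diffusion prior maps the CLIP text encoding to a corresponding CLIP image encoding;
  3. Finally, the modified GLIDE generation model maps from the representation space to image space via reverse diffusion, generating one of many possible images.
High-level overview of the DALL-E 2 image generation pipeline. Modified from: https://arxiv.org/abs/2204.06125

That is how DALL-E 2 works.

Three key takeaways from the development of DALL-E 2:

  • DALL-E 2 demonstrates the power of diffusion models in deep learning: both the prior and the image generation sub-model are diffusion-based. Diffusion models have only become popular in the past few years, but they have already proven their worth, and we can expect to see more of them across future research.
  • Second, we should recognize both the necessity and the power of using natural language as a means to train state-of-the-art deep learning models. DALL-E 2's strength ultimately comes from the sheer volume of paired natural language and image data available on the Internet. Using such data removes the bottleneck of laboriously hand-labeling datasets, and its noisy, uncurated nature also better reflects the real-world data that deep learning models must be robust to.
  • Finally, DALL-E 2 reaffirms the position of Transformers as the top choice for models trained on web-scale datasets, thanks to their impressive parallelizability.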

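ELECTRA: Beyond BERT, the Best NLP Pretraining Model of 2019

ELECTRA stands for Efficiently Learning an Encoder that Classifies Token Replacements Accurately.

GitHub: https://github.com/ymcui/Chinese-ELECTRA

ELECTRA: https://arxiv.org/abs/2003.10555

The right plot is a zoomed-in version of the left one. The vertical axis is the GLUE score and the horizontal axis is FLOPs (floating point operations), the floating-point compute count reported by TensorFlow. The plot shows that ELECTRA consistently beats BERT at the same scale, and with longer training it matches RoBERTa, the SOTA model at the time. The curve on the left also suggests that ELECTRA still has room to improve.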

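2. Model Architecture

An NLP-style Generator-Discriminator

ELECTRA's main contribution is a new pretraining task and framework: it replaces the generative Masked Language Model (MLM) pretraining task with the discriminative Replaced Token Detection (RTD) task, which decides whether each token has been replaced by a language model. A natural question: can I just randomly replace some words in the input and have BERT predict whether each was replaced? Yes, and I have tried exactly that, but it does not work well, because random replacement is too easy.

So how do we make the task harder? Well, don't we already pretrain an MLM model?

So the authors simply use an MLM G-BERT to alter the input sentence and then hand it to a D-BERT to decide which tokens were changed, as follows:

And so we NLPers have finally managed to borrow GANs from CV!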

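Replaced Token Detection

The structure above has a problem. The input sentence passes through the generator, which outputs a rewritten sentence, and because words are discrete the gradient is cut off at that point: the discriminator's gradient cannot reach the generator. The generator is therefore still trained with the MLM objective (the authors later verify that this works better), while the discriminator is trained as a sequence labeler that decides whether each token is real or replaced. The two are trained jointly, but the discriminator's gradient is not propagated to the generator. The objective is:

$$\min_{\theta_G,\, \theta_D} \sum_{x \in \mathcal{X}} \mathcal{L}_{MLM}(x, \theta_G) + \lambda\, \mathcal{L}_{Disc}(x, \theta_D)$$

Because the discriminator's task is comparatively easy, the RTD loss is much smaller than the MLM loss, so it is multiplied by a coefficient λ; the authors use 50 in training.

Another point worth noting is that the discriminator loss is computed over all tokens, whereas BERT's MLM loss ignores tokens that were not masked. The authors' later experiments confirm that computing the loss over all tokens improves both efficiency and results.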
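To make the combined objective concrete, here is a minimal plain-Python sketch of the discriminator side: it derives replaced/original labels by comparing the corrupted sequence with the original, computes a binary cross-entropy over all tokens, and adds it to the MLM loss with weight λ = 50. The function names are made up for this illustration, and the probabilities and the MLM loss value are toy numbers rather than model outputs.

```python
import math


def rtd_labels(original_tokens, corrupted_tokens):
    """1 where the generator changed the token, 0 where it is still the original.

    A generator sample that happens to equal the original token counts as original.
    """
    return [int(o != c) for o, c in zip(original_tokens, corrupted_tokens)]


def rtd_loss(replaced_probs, labels, eps=1e-12):
    """Binary cross-entropy averaged over ALL tokens, not only the masked ones."""
    total = 0.0
    for prob, label in zip(replaced_probs, labels):
        total -= label * math.log(prob + eps) + (1 - label) * math.log(1.0 - prob + eps)
    return total / len(labels)


def electra_loss(mlm_loss, disc_loss, lam=50.0):
    """Combined objective: generator MLM loss plus lambda times the RTD loss."""
    return mlm_loss + lam * disc_loss


original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]
labels = rtd_labels(original, corrupted)
disc = rtd_loss([0.01, 0.01, 0.99, 0.01, 0.01], labels)
print(labels, disc, electra_loss(2.0, disc))
```

Every position contributes a training signal to the discriminator, which is the efficiency argument made in the paragraph above.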

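In fact, the Generator-Discriminator architecture in ELECTRA differs from a GAN in several ways, which the authors list:

3. Experiments and Conclusions

Innovation is never easy. With the idea in place, the authors ran a large number of experiments to validate the model structure, parameters, and training scheme.

Weight Sharing

Does sharing weights between the generator and discriminator help? With a generator and discriminator of the same size, the score is 83.6 without sharing, 84.3 when sharing only the token embeddings, and 84.4 when sharing all weights. The authors argue that the generator learns embeddings better: the MLM softmax is over the whole vocabulary, so backpropagation updates all embeddings, whereas the discriminator only updates the embeddings of the input tokens. In the end the authors use embedding sharing only.

Smaller Generators

The weight sharing experiment shows that sharing only the embeddings is enough. In that case, can the generator be shrunk to make training more efficient? Keeping the other settings and reducing the layer size, the authors obtain the relationship shown below:

The results are best when the generator is between 1/4 and 1/2 the size of the discriminator. The authors believe an overly strong generator makes the discriminator's task too hard (discriminator: "please be smaller, this is too hard for me").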

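Training Algorithms

Besides the MLM loss, the authors also tried two other training strategies:

  1. Adversarial Contrastive Estimation: because of the issues above ELECTRA cannot be trained as a GAN, but it can still be trained in an adversarial spirit. The authors change the generator's objective from minimizing the MLM loss to maximizing the discriminator's RTD loss on replaced tokens. One problem remains: the new generator loss cannot be optimized by backpropagating through the sampling step, so the authors use policy gradient reinforcement learning. After optimization the generator reaches 54% accuracy on the MLM task, compared with 65% under MLE training. (Thanks to @阿雪我要 for the correction.)
  2. Two-stage training: first train the generator, then freeze it, initialize the discriminator with the generator's weights, and train the discriminator for the same number of steps.

Comparing the three training strategies gives the figure below:

The "isolated" training strategy still works best. Two-stage training is somewhat weaker, which the authors suspect is because the generator is too strong and makes the discrimination task harder, but it still ends up better than BERT itself, further confirming the value of discriminative pretraining.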

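Small model? Big model?

These two sections really outclass previous models. The authors restate that their main goal is pretraining efficiency, so they build ELECTRA-Small and BERT-Small, which can be trained comfortably on a single GPU, and compare them with ELMo, GPT, and others at their original sizes. The results are as follows:

The numbers are excellent: with only 14M parameters, 13% of the previous size, both training speed and quality improve. A big thumbs-up from me.

We have seen what small ELECTRA can do. What about large ELECTRA? Straight to the figure:

The figure shows each model on GLUE dev/test. ELECTRA matches RoBERTa with only 1/4 of the compute. The authors use the XLNet corpus, about 126G, whereas RoBERTa uses 160G. Because of time and resource constraints, the authors did not train ELECTRA longer (which should help further) or use leaderboard tricks, so its results on the actual GLUE test set are modest (currently T5 is at 89.7 and RoBERTa at 88.5; I did not see ELECTRA listed).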

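Efficiency Analysis

As mentioned earlier, BERT computes its loss only on the 15% of tokens that were masked, whereas ELECTRA computes it on all tokens. The authors ran several more experiments to see which approach is better:

  1. ELECTRA 15%: the discriminator loss is computed only on 15% of the tokens
  2. Replace MLM: train BERT with MLM, but instead of replacing inputs with [MASK], replace them with tokens from a generator. This removes the discrepancy between pretraining and fine-tuning.
  3. All-Tokens MLM: same as Replace MLM, except that BERT's objective becomes predicting all tokens, which is closer to ELECTRA.

The results of the three experiments are as follows:

We can see that:

  1. ELECTRA vs. ELECTRA 15%: computing the loss on all tokens does improve results
  2. Replace MLM vs. BERT: the [MASK] token does affect BERT. BERT already uses a trick here, keeping the original token or using a random token for 10% of the selected positions each; without it the results would probably be worse.
  3. All-Tokens MLM vs. BERT: if BERT predicts all tokens, its results get close to ELECTRA

The authors also find that the smaller ELECTRA is, the larger its gain over BERT, which suggests that a fully trained ELECTRA would do even better. They further conjecture that because ELECTRA is a discriminative task and does not need to model the full data distribution, it is more parameter-efficient.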

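4. Summary

I stumbled on ELECTRA while it was still under blind review at ICLR, and the abstract alone felt like discovering a new continent, mainly because I had tried the Replaced Token Detection task myself. From analyzing results on my everyday tasks and from a paper I had read shortly before, I came to feel strongly that although BERT encodes context very well, it lacks fine-grained semantic representations. One figure makes this clear:

This shows token encodings after dimensionality reduction. "sky" and "sea" are as different as sky and sea, yet they get extremely similar encodings because their contexts are the same. The lack of fine-grained representation can have a large impact on real tasks, and the model is even more helpless under targeted attacks. So at the time I looked for a finer-grained task that would make BERT distinguish every token, but random replacement within the same sentence did not work well. I believe many people have thought of this task without exploring it this deeply, which is a reminder that ideas are everywhere, and only digging deeper leads to SOTA.

ELECTRA is the best idea I have seen in the year since BERT came out. It not only proposes a pretraining task that beats MLM but also introduces a GAN-like framework that suits NLP very well. GANs are hugely impressive; when I saw deepfake I wondered when we would get our own "deepcheat", but I had heard that GANs never worked very well in NLP. Although ELECTRA only uses the discriminator in the end, I think it has, to some extent, opened Pandora's box.

Notes on interview topics for algorithm engineers:

https://mp.weixin.qq.com/s/nPVbgOBOPs5VjW6_U-Om3w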

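Transformer: Attention Is All You Need

Transformer characteristics:

· An encoder-decoder model

· Not an RNN

· Built entirely on fully connected layers and attention

· Far outperforms RNNs (on large datasets)

Recall the seq2seq model:

How to compute c:

From RNN to Transformer: the self-attention layer

In self-attention, each word has three different vectors: a Query vector (Q), a Key vector (K), and a Value vector (V), each of length 64. They are obtained by multiplying the embedding vector X by three different weight matrices WQ, WK, and WV, which all have the same size, 512×64.

In matrix form this is summarized as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$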
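The matrix form of self-attention can be sketched in plain Python with nested lists. This is a minimal illustration with toy 2-dimensional embeddings and identity projection matrices (the text's real sizes are 512×64); the helper names `matmul`, `softmax`, and `self_attention` are defined here only for the example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention: softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Two tokens with 2-d embeddings; identity projections keep the numbers readable.
x = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
print(weights)  # each row sums to 1; each token attends most to itself here
print(output)
```

Each output row is a weighted average of the value vectors, with weights determined by how well that token's query matches every key.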

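Building the Transformer: the multi-head self-attention layer

The above is a single self-attention layer. We use N such layers in parallel, with no parameter sharing between them, and concatenate their outputs to form the output of the multi-head attention layer.

Stacked Self-Attention Layers

An encoder block:

Finally, stack 6 of them to form the Transformer encoder:

Decoder:

encoder block:

Overall network: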

Few-Shot Papers: A Collection of Few-Shot Learning Papers

From the GitHub repository: https://github.com/tata1661/FSL-Mate/tree/master/FewShotPapers

This repository contains few-shot learning (FSL) papers mentioned in our FSL survey published in ACM Computing Surveys (JCR Q1, CORE A*).

For convenience, we also include public implementations of respective authors.

We will update this paper list to include new FSL papers periodically.

Citation

Please cite our paper if you find it helpful.

@article{wang2020generalizing,
  title={Generalizing from a few examples: A survey on few-shot learning},
  author={Wang, Yaqing and Yao, Quanming and Kwok, James T and Ni, Lionel M},
  journal={ACM Computing Surveys},
  volume={53},
  number={3},
  pages={1--34},
  year={2020},
  publisher={ACM New York, NY, USA}
}

Content

  1. Survey
  2. Data
  3. Model
    1. Multitask Learning
    2. Embedding/Metric Learning
    3. Learning with External Memory
    4. Generative Modeling
  4. Algorithm
    1. Refining Existing Parameters
    2. Refining Meta-learned Parameters
    3. Learning Search Steps
  5. Applications
    1. Computer Vision
    2. Robotics
    3. Natural Language Processing
    4. Acoustic Signal Processing
    5. Recommendation
    6. Others
  6. Theories
  7. Few-shot Learning and Zero-shot Learning
  8. Variants of Few-shot Learning
  9. Datasets/Benchmarks
  10. Software Library

Survey

  1. Generalizing from a few examples: A survey on few-shot learning, CSUR, 2020 Y. Wang, Q. Yao, J. T. Kwok, and L. M. Ni. paper arXiv

Data

  1. Learning from one example through shared densities on transforms, in CVPR, 2000. E. G. Miller, N. E. Matsakis, and P. A. Viola. paper
  2. Domain-adaptive discriminative one-shot learning of gestures, in ECCV, 2014. T. Pfister, J. Charles, and A. Zisserman. paper
  3. One-shot learning of scene locations via feature trajectory transfer, in CVPR, 2016. R. Kwitt, S. Hegenbart, and M. Niethammer. paper
  4. Low-shot visual recognition by shrinking and hallucinating features, in ICCV, 2017. B. Hariharan and R. Girshick. paper code
  5. Improving one-shot learning through fusing side information, arXiv preprint, 2017. Y.H.Tsai and R.Salakhutdinov. paper
  6. Fast parameter adaptation for few-shot image captioning and visual question answering, in ACM MM, 2018. X. Dong, L. Zhu, D. Zhang, Y. Yang, and F. Wu. paper
  7. Exploit the unknown gradually: One-shot video-based person re-identification by stepwise learning, in CVPR, 2018. Y. Wu, Y. Lin, X. Dong, Y. Yan, W. Ouyang, and Y. Yang. paper
  8. Low-shot learning with large-scale diffusion, in CVPR, 2018. M. Douze, A. Szlam, B. Hariharan, and H. Jégou. paper
  9. Diverse few-shot text classification with multiple metrics, in NAACL-HLT, 2018. M. Yu, X. Guo, J. Yi, S. Chang, S. Potdar, Y. Cheng, G. Tesauro, H. Wang, and B. Zhou. paper code
  10. Delta-encoder: An effective sample synthesis method for few-shot object recognition, in NeurIPS, 2018. E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, A. Kumar, R. Feris, R. Giryes, and A. Bronstein. paper
  11. Low-shot learning via covariance-preserving adversarial augmentation networks, in NeurIPS, 2018. H. Gao, Z. Shou, A. Zareian, H. Zhang, and S. Chang. paper
  12. Learning to self-train for semi-supervised few-shot classification, in NeurIPS, 2019. X. Li, Q. Sun, Y. Liu, S. Zheng, Q. Zhou, T.-S. Chua, and B. Schiele. paper
  13. Few-shot learning with global class representations, in ICCV, 2019. A. Li, T. Luo, T. Xiang, W. Huang, and L. Wang. paper
  14. AutoAugment: Learning augmentation policies from data, in CVPR, 2019. E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, and Q. V. Le. paper
  15. EDA: Easy data augmentation techniques for boosting performance on text classification tasks, in EMNLP and IJCNLP, 2019. J. Wei and K. Zou. paper
  16. LaSO: Label-set operations networks for multi-label few-shot learning, in CVPR, 2019. A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, and A. M. Bronstein. paper code
  17. Image deformation meta-networks for one-shot learning, in CVPR, 2019. Z. Chen, Y. Fu, Y.-X. Wang, L. Ma, W. Liu, and M. Hebert. paper code
  18. Spot and learn: A maximum-entropy patch sampler for few-shot image classification, in CVPR, 2019. W.-H. Chu, Y.-J. Li, J.-C. Chang, and Y.-C. F. Wang. paper
  19. Data augmentation using learned transformations for one-shot medical image segmentation, in CVPR, 2019. A. Zhao, G. Balakrishnan, F. Durand, J. V. Guttag, and A. V. Dalca. paper
  20. Adversarial feature hallucination networks for few-shot learning, in CVPR, 2020. K. Li, Y. Zhang, K. Li, and Y. Fu. paper
  21. Instance credibility inference for few-shot learning, in CVPR, 2020. Y. Wang, C. Xu, C. Liu, L. Zhang, and Y. Fu. paper
  22. Diversity transfer network for few-shot learning, in AAAI, 2020. M. Chen, Y. Fang, X. Wang, H. Luo, Y. Geng, X. Zhang, C. Huang, W. Liu, and B. Wang. paper code
  23. Neural snowball for few-shot relation learning, in AAAI, 2020. T. Gao, X. Han, R. Xie, Z. Liu, F. Lin, L. Lin, and M. Sun. paper code
  24. Associative alignment for few-shot image classification, in ECCV, 2020. A. Afrasiyabi, J. Lalonde, and C. Gagné. paper code
  25. Information maximization for few-shot learning, in NeurIPS, 2020. M. Boudiaf, I. Ziko, J. Rony, J. Dolz, P. Piantanida, and I. B. Ayed. paper code
  26. Self-training for few-shot transfer across extreme task differences, in ICLR, 2021. C. P. Phoo and B. Hariharan. paper
  27. Free lunch for few-shot learning: Distribution calibration, in ICLR, 2021. S. Yang, L. Liu, and M. Xu. paper code
  28. Parameterless transductive feature re-representation for few-shot learning, in ICML, 2021. W. Cui and Y. Guo. paper
  29. Learning intact features by erasing-inpainting for few-shot classification, in AAAI, 2021. J. Li, Z. Wang, and X. Hu. paper
  30. Variational feature disentangling for fine-grained few-shot classification, in ICCV, 2021. J. Xu, H. Le, M. Huang, S. Athar, and D. Samaras. paper
  31. Coarsely-labeled data for better few-shot transfer, in ICCV, 2021. C. P. Phoo and B. Hariharan. paper
  32. Pseudo-loss confidence metric for semi-supervised few-shot learning, in ICCV, 2021. K. Huang, J. Geng, W. Jiang, X. Deng, and Z. Xu. paper
  33. Iterative label cleaning for transductive and semi-supervised few-shot learning, in ICCV, 2021. M. Lazarou, T. Stathaki, and Y. Avrithis. paper
  34. Meta two-sample testing: Learning kernels for testing with limited data, in NeurIPS, 2021. F. Liu, W. Xu, J. Lu, and D. J. Sutherland. paper
  35. Dynamic distillation network for cross-domain few-shot recognition with unlabeled data, in NeurIPS, 2021. A. Islam, C.-F. Chen, R. Panda, L. Karlinsky, R. Feris, and R. Radke. paper
  36. Towards better understanding and better generalization of low-shot classification in histology images with contrastive learning, in ICLR, 2022. J. Yang, H. Chen, J. Yan, X. Chen, and J. Yao. paper code
  37. FlipDA: Effective and robust data augmentation for few-shot learning, in ACL, 2022. J. Zhou, Y. Zheng, J. Tang, L. Jian, and Z. Yang. paper code
  38. PromDA: Prompt-based data augmentation for low-resource NLU tasks, in ACL, 2022. Y. Wang, C. Xu, Q. Sun, H. Hu, C. Tao, X. Geng, and D. Jiang. paper code
  39. N-shot learning for augmenting task-oriented dialogue state tracking, in Findings of ACL, 2022. I. T. Aksu, Z. Liu, M. Kan, and N. F. Chen. paper
  40. Generating representative samples for few-shot classification, in CVPR, 2022. J. Xu and H. Le. paper code
  41. Semi-supervised few-shot learning via multi-factor clustering, in CVPR, 2022. J. Ling, L. Liao, M. Yang, and J. Shuai. paper

Model

Multitask Learning

  1. Multi-task transfer methods to improve one-shot learning for multimedia event detection, in BMVC, 2015. W. Yan, J. Yap, and G. Mori. paper
  2. Label efficient learning of transferable representations across domains and tasks, in NeurIPS, 2017. Z. Luo, Y. Zou, J. Hoffman, and L. Fei-Fei. paper
  3. Few-shot adversarial domain adaptation, in NeurIPS, 2017. S. Motiian, Q. Jones, S. Iranmanesh, and G. Doretto. paper
  4. One-shot unsupervised cross domain translation, in NeurIPS, 2018. S. Benaim and L. Wolf. paper
  5. Multi-content GAN for few-shot font style transfer, in CVPR, 2018. S. Azadi, M. Fisher, V. G. Kim, Z. Wang, E. Shechtman, and T. Darrell. paper code
  6. Feature space transfer for data augmentation, in CVPR, 2018. B. Liu, X. Wang, M. Dixit, R. Kwitt, and N. Vasconcelos. paper
  7. Fine-grained visual categorization using meta-learning optimization with sample selection of auxiliary data, in ECCV, 2018. Y. Zhang, H. Tang, and K. Jia. paper
  8. Few-shot charge prediction with discriminative legal attributes, in COLING, 2018. Z. Hu, X. Li, C. Tu, Z. Liu, and M. Sun. paper
  9. Boosting few-shot visual learning with self-supervision, in ICCV, 2019. S. Gidaris, A. Bursuc, N. Komodakis, P. Pérez, and M. Cord. paper
  10. When does self-supervision improve few-shot learning?, in ECCV, 2020. J. Su, S. Maji, and B. Hariharan. paper
  11. Pareto self-supervised training for few-shot learning, in CVPR, 2021. Z. Chen, J. Ge, H. Zhan, S. Huang, and D. Wang. paper
  12. Bridging multi-task learning and meta-learning: Towards efficient training and effective adaptation, in ICML, 2021. H. Wang, H. Zhao, and B. Li. paper code

Embedding/Metric Learning

  1. Object classification from a single example utilizing class relevance metrics, in NeurIPS, 2005. M. Fink. paper
  2. Optimizing one-shot recognition with micro-set learning, in CVPR, 2010. K. D. Tang, M. F. Tappen, R. Sukthankar, and C. H. Lampert. paper
  3. Siamese neural networks for one-shot image recognition, ICML deep learning workshop, 2015. G. Koch, R. Zemel, and R. Salakhutdinov. paper
  4. Matching networks for one shot learning, in NeurIPS, 2016. O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra et al. paper
  5. Learning feed-forward one-shot learners, in NeurIPS, 2016. L. Bertinetto, J. F. Henriques, J. Valmadre, P. Torr, and A. Vedaldi. paper
  6. Few-shot learning through an information retrieval lens, in NeurIPS, 2017. E. Triantafillou, R. Zemel, and R. Urtasun. paper
  7. Prototypical networks for few-shot learning, in NeurIPS, 2017. J. Snell, K. Swersky, and R. S. Zemel. paper code
  8. Attentive recurrent comparators, in ICML, 2017. P. Shyam, S. Gupta, and A. Dukkipati. paper
  9. Learning algorithms for active learning, in ICML, 2017. P. Bachman, A. Sordoni, and A. Trischler. paper
  10. Active one-shot learning, arXiv preprint, 2017. M. Woodward and C. Finn. paper
  11. Structured set matching networks for one-shot part labeling, in CVPR, 2018. J. Choi, J. Krishnamurthy, A. Kembhavi, and A. Farhadi. paper
  12. Low-shot learning from imaginary data, in CVPR, 2018. Y.-X. Wang, R. Girshick, M. Hebert, and B. Hariharan. paper
  13. Learning to compare: Relation network for few-shot learning, in CVPR, 2018. F. Sung, Y. Yang, L. Zhang, T. Xiang, P. H. Torr, and T. M. Hospedales. paper code
  14. Dynamic conditional networks for few-shot learning, in ECCV, 2018. F. Zhao, J. Zhao, S. Yan, and J. Feng. paper code
  15. TADAM: Task dependent adaptive metric for improved few-shot learning, in NeurIPS, 2018. B. Oreshkin, P. R. López, and A. Lacoste. paper
  16. Meta-learning for semi-supervised few-shot classification, in ICLR, 2018. M. Ren, S. Ravi, E. Triantafillou, J. Snell, K. Swersky, J. B. Tenenbaum, H. Larochelle, and R. S. Zemel. paper code
  17. Few-shot learning with graph neural networks, in ICLR, 2018. V. G. Satorras and J. B. Estrach. paper code
  18. A simple neural attentive meta-learner, in ICLR, 2018. N. Mishra, M. Rohaninejad, X. Chen, and P. Abbeel. paper
  19. Meta-learning with differentiable closed-form solvers, in ICLR, 2019. L. Bertinetto, J. F. Henriques, P. Torr, and A. Vedaldi. paper
  20. Learning to propagate labels: Transductive propagation network for few-shot learning, in ICLR, 2019. Y. Liu, J. Lee, M. Park, S. Kim, E. Yang, S. Hwang, and Y. Yang. paper code
  21. Multi-level matching and aggregation network for few-shot relation classification, in ACL, 2019. Z.-X. Ye and Z.-H. Ling. paper
  22. Induction networks for few-shot text classification, in EMNLP-IJCNLP, 2019. R. Geng, B. Li, Y. Li, X. Zhu, P. Jian, and J. Sun. paper
  23. Hierarchical attention prototypical networks for few-shot text classification, in EMNLP-IJCNLP, 2019. S. Sun, Q. Sun, K. Zhou, and T. Lv. paper
  24. Cross attention network for few-shot classification, in NeurIPS, 2019. R. Hou, H. Chang, B. Ma, S. Shan, and X. Chen. paper
  25. Hybrid attention-based prototypical networks for noisy few-shot relation classification, in AAAI, 2019. T. Gao, X. Han, Z. Liu, and M. Sun. paper code
  26. Attention-based multi-context guiding for few-shot semantic segmentation, in AAAI, 2019. T. Hu, P. Yang, C. Zhang, G. Yu, Y. Mu, and C. G. M. Snoek. paper
  27. Distribution consistency based covariance metric networks for few-shot learning, in AAAI, 2019. W. Li, L. Wang, J. Xu, J. Huo, Y. Gao, and J. Luo. paper
  28. A dual attention network with semantic embedding for few-shot learning, in AAAI, 2019. S. Yan, S. Zhang, and X. He. paper
  29. TapNet: Neural network augmented with task-adaptive projection for few-shot learning, in ICML, 2019. S. W. Yoon, J. Seo, and J. Moon. paper
  30. Prototype propagation networks (PPN) for weakly-supervised few-shot learning on category graph, in IJCAI, 2019. L. Liu, T. Zhou, G. Long, J. Jiang, L. Yao, and C. Zhang. paper code
  31. Collect and select: Semantic alignment metric learning for few-shot learning, in ICCV, 2019. F. Hao, F. He, J. Cheng, L. Wang, J. Cao, and D. Tao. paper
  32. Transductive episodic-wise adaptive metric for few-shot learning, in ICCV, 2019. L. Qiao, Y. Shi, J. Li, Y. Wang, T. Huang, and Y. Tian. paper
  33. Few-shot learning with embedded class models and shot-free meta training, in ICCV, 2019. A. Ravichandran, R. Bhotika, and S. Soatto. paper
  34. PARN: Position-aware relation networks for few-shot learning, in ICCV, 2019. Z. Wu, Y. Li, L. Guo, and K. Jia. paper
  35. PANet: Few-shot image semantic segmentation with prototype alignment, in ICCV, 2019. K. Wang, J. H. Liew, Y. Zou, D. Zhou, and J. Feng. paper code
  36. RepMet: Representative-based metric learning for classification and few-shot object detection, in CVPR, 2019. L. Karlinsky, J. Shtok, S. Harary, E. Schwartz, A. Aides, R. Feris, R. Giryes, and A. M. Bronstein. paper code
  37. Edge-labeling graph neural network for few-shot learning, in CVPR, 2019. J. Kim, T. Kim, S. Kim, and C. D. Yoo. paper
  38. Finding task-relevant features for few-shot learning by category traversal, in CVPR, 2019. H. Li, D. Eigen, S. Dodge, M. Zeiler, and X. Wang. paper code
  39. Revisiting local descriptor based image-to-class measure for few-shot learning, in CVPR, 2019. W. Li, L. Wang, J. Xu, J. Huo, Y. Gao, and J. Luo. paper code
  40. TAFE-Net: Task-aware feature embeddings for low shot learning, in CVPR, 2019. X. Wang, F. Yu, R. Wang, T. Darrell, and J. E. Gonzalez. paper code
  41. Improved few-shot visual classification, in CVPR, 2020. P. Bateni, R. Goyal, V. Masrani, F. Wood, and L. Sigal. paper
  42. Boosting few-shot learning with adaptive margin loss, in CVPR, 2020. A. Li, W. Huang, X. Lan, J. Feng, Z. Li, and L. Wang. paper
  43. Adaptive subspaces for few-shot learning, in CVPR, 2020. C. Simon, P. Koniusz, R. Nock, and M. Harandi. paper
  44. DPGN: Distribution propagation graph network for few-shot learning, in CVPR, 2020. L. Yang, L. Li, Z. Zhang, X. Zhou, E. Zhou, and Y. Liu. paper
  45. Few-shot learning via embedding adaptation with set-to-set functions, in CVPR, 2020. H.-J. Ye, H. Hu, D.-C. Zhan, and F. Sha. paper code
  46. DeepEMD: Few-shot image classification with differentiable earth mover’s distance and structured classifiers, in CVPR, 2020. C. Zhang, Y. Cai, G. Lin, and C. Shen. paper code
  47. Few-shot text classification with distributional signatures, in ICLR, 2020. Y. Bao, M. Wu, S. Chang, and R. Barzilay. paper code
  48. Learning task-aware local representations for few-shot learning, in IJCAI, 2020. C. Dong, W. Li, J. Huo, Z. Gu, and Y. Gao. paper
  49. SimPropNet: Improved similarity propagation for few-shot image segmentation, in IJCAI, 2020. S. Gairola, M. Hemani, A. Chopra, and B. Krishnamurthy. paper
  50. Asymmetric distribution measure for few-shot learning, in IJCAI, 2020. W. Li, L. Wang, J. Huo, Y. Shi, Y. Gao, and J. Luo. paper
  51. Transductive relation-propagation network for few-shot learning, in IJCAI, 2020. Y. Ma, S. Bai, S. An, W. Liu, A. Liu, X. Zhen, and X. Liu. paper
  52. Weakly supervised few-shot object segmentation using co-attention with visual and semantic embeddings, in IJCAI, 2020. M. Siam, N. Doraiswamy, B. N. Oreshkin, H. Yao, and M. Jägersand. paper
  53. Few-shot learning on graphs via super-classes based on graph spectral measures, in ICLR, 2020. J. Chauhan, D. Nathani, and M. Kaul. paper
  54. SGAP-Net: Semantic-guided attentive prototypes network for few-shot human-object interaction recognition, in AAAI, 2020. Z. Ji, X. Liu, Y. Pang, and X. Li. paper
  55. One-shot image classification by learning to restore prototypes, in AAAI, 2020. W. Xue and W. Wang. paper
  56. Negative margin matters: Understanding margin in few-shot classification, in ECCV, 2020. B. Liu, Y. Cao, Y. Lin, Q. Li, Z. Zhang, M. Long, and H. Hu. paper code
  57. Prototype rectification for few-shot learning, in ECCV, 2020. J. Liu, L. Song, and Y. Qin. paper
  58. Rethinking few-shot image classification: A good embedding is all you need?, in ECCV, 2020. Y. Tian, Y. Wang, D. Krishnan, J. B. Tenenbaum, and P. Isola. paper code
  59. SEN: A novel feature normalization dissimilarity measure for prototypical few-shot learning networks, in ECCV, 2020. V. N. Nguyen, S. Løkse, K. Wickstrøm, M. Kampffmeyer, D. Roverso, and R. Jenssen. paper
  60. TAFSSL: Task-adaptive feature sub-space learning for few-shot classification, in ECCV, 2020. M. Lichtenstein, P. Sattigeri, R. Feris, R. Giryes, and L. Karlinsky. paper
  61. Attentive prototype few-shot learning with capsule network-based embedding, in ECCV, 2020. F. Wu, J. S. Smith, W. Lu, C. Pang, and B. Zhang. paper
  62. Embedding propagation: Smoother manifold for few-shot classification, in ECCV, 2020. P. Rodríguez, I. Laradji, A. Drouin, and A. Lacoste. paper code
  63. Laplacian regularized few-shot learning, in ICML, 2020. I. M. Ziko, J. Dolz, E. Granger, and I. B. Ayed. paper code
  64. TAdaNet: Task-adaptive network for graph-enriched meta-learning, in KDD, 2020. Q. Suo, J. Chou, W. Zhong, and A. Zhang. paper
  65. Concept learners for few-shot learning, in ICLR, 2021. K. Cao, M. Brbic, and J. Leskovec. paper
  66. Reinforced attention for few-shot learning and beyond, in CVPR, 2021. J. Hong, P. Fang, W. Li, T. Zhang, C. Simon, M. Harandi, and L. Petersson. paper
  67. Mutual CRF-GNN for few-shot learning, in CVPR, 2021. S. Tang, D. Chen, L. Bai, K. Liu, Y. Ge, and W. Ouyang. paper
  68. Few-shot classification with feature map reconstruction networks, in CVPR, 2021. D. Wertheimer, L. Tang, and B. Hariharan. paper code
  69. ECKPN: Explicit class knowledge propagation network for transductive few-shot learning, in CVPR, 2021. C. Chen, X. Yang, C. Xu, X. Huang, and Z. Ma. paper
  70. Exploring complementary strengths of invariant and equivariant representations for few-shot learning, in CVPR, 2021. M. N. Rizve, S. Khan, F. S. Khan, and M. Shah. paper
  71. Rethinking class relations: Absolute-relative supervised and unsupervised few-shot learning, in CVPR, 2021. H. Zhang, P. Koniusz, S. Jian, H. Li, and P. H. S. Torr. paper
  72. Unsupervised embedding adaptation via early-stage feature reconstruction for few-shot classification, in ICML, 2021. D. H. Lee and S. Chung. paper code
  73. Learning a few-shot embedding model with contrastive learning, in AAAI, 2021. C. Liu, Y. Fu, C. Xu, S. Yang, J. Li, C. Wang, and L. Zhang. paper
  74. Looking wider for better adaptive representation in few-shot learning, in AAAI, 2021. J. Zhao, Y. Yang, X. Lin, J. Yang, and L. He. paper
  75. Tailoring embedding function to heterogeneous few-shot tasks by global and local feature adaptors, in AAAI, 2021. S. Lu, H. Ye, and D.-C. Zhan. paper
  76. Knowledge guided metric learning for few-shot text classification, in NAACL-HLT, 2021. D. Sui, Y. Chen, B. Mao, D. Qiu, K. Liu, and J. Zhao. paper
  77. Mixture-based feature space learning for few-shot image classification, in ICCV, 2021. A. Afrasiyabi, J. Lalonde, and C. Gagné. paper
  78. Z-score normalization, hubness, and few-shot learning, in ICCV, 2021. N. Fei, Y. Gao, Z. Lu, and T. Xiang. paper
  79. Relational embedding for few-shot classification, in ICCV, 2021. D. Kang, H. Kwon, J. Min, and M. Cho. paper code
  80. Transductive few-shot classification on the oblique manifold, in ICCV, 2021. G. Qi, H. Yu, Z. Lu, and S. Li. paper code
  81. Curvature generation in curved spaces for few-shot learning, in ICCV, 2021. Z. Gao, Y. Wu, Y. Jia, and M. Harandi. paper
  82. On episodes, prototypical networks, and few-shot learning, in NeurIPS, 2021. S. Laenen and L. Bertinetto. paper
  83. Few-shot learning as cluster-induced Voronoi diagrams: A geometric approach, in ICLR, 2022. C. Ma, Z. Huang, M. Gao, and J. Xu. paper code
  84. Few-shot learning with siamese networks and label tuning, in ACL, 2022. T. Müller, G. Pérez-Torró, and M. Franco-Salvador. paper code
  85. Learning to affiliate: Mutual centralized learning for few-shot classification, in CVPR, 2022. Y. Liu, W. Zhang, C. Xiang, T. Zheng, D. Cai, and X. He. paper
  86. Matching feature sets for few-shot image classification, in CVPR, 2022. A. Afrasiyabi, H. Larochelle, J. Lalonde, and C. Gagné. paper code
  87. Joint distribution matters: Deep Brownian distance covariance for few-shot classification, in CVPR, 2022. J. Xie, F. Long, J. Lv, Q. Wang, and P. Li. paper
  88. CAD: Co-adapting discriminative features for improved few-shot classification, in CVPR, 2022. P. Chikontwe, S. Kim, and S. H. Park. paper
  89. Ranking distance calibration for cross-domain few-shot learning, in CVPR, 2022. P. Li, S. Gong, C. Wang, and Y. Fu. paper
  90. EASE: Unsupervised discriminant subspace learning for transductive few-shot learning, in CVPR, 2022. H. Zhu and P. Koniusz. paper code
  91. Cross-domain few-shot learning with task-specific adapters, in CVPR, 2022. W. Li, X. Liu, and H. Bilen. paper code

Learning with External Memory

  1. Meta-learning with memory-augmented neural networks, in ICML, 2016. A. Santoro, S. Bartunov, M. Botvinick, D. Wierstra, and T. Lillicrap. paper
  2. Few-shot object recognition from machine-labeled web images, in CVPR, 2017. Z. Xu, L. Zhu, and Y. Yang. paper
  3. Learning to remember rare events, in ICLR, 2017. Ł. Kaiser, O. Nachum, A. Roy, and S. Bengio. paper
  4. Meta networks, in ICML, 2017. T. Munkhdalai and H. Yu. paper
  5. Memory matching networks for one-shot image recognition, in CVPR, 2018. Q. Cai, Y. Pan, T. Yao, C. Yan, and T. Mei. paper
  6. Compound memory networks for few-shot video classification, in ECCV, 2018. L. Zhu and Y. Yang. paper
  7. Memory, show the way: Memory based few shot word representation learning, in EMNLP, 2018. J. Sun, S. Wang, and C. Zong. paper
  8. Rapid adaptation with conditionally shifted neurons, in ICML, 2018. T. Munkhdalai, X. Yuan, S. Mehri, and A. Trischler. paper
  9. Adaptive posterior learning: Few-shot learning with a surprise-based memory module, in ICLR, 2019. T. Ramalho and M. Garnelo. paper code
  10. Coloring with limited data: Few-shot colorization via memory augmented networks, in CVPR, 2019. S. Yoo, H. Bahng, S. Chung, J. Lee, J. Chang, and J. Choo. paper
  11. ACMM: Aligned cross-modal memory for few-shot image and sentence matching, in ICCV, 2019. Y. Huang and L. Wang. paper
  12. Dynamic memory induction networks for few-shot text classification, in ACL, 2020. R. Geng, B. Li, Y. Li, J. Sun, and X. Zhu. paper
  13. Few-shot visual learning with contextual memory and fine-grained calibration, in IJCAI, 2020. Y. Ma, W. Liu, S. Bai, Q. Zhang, A. Liu, W. Chen, and X. Liu. paper
  14. Learn from concepts: Towards the purified memory for few-shot learning, in IJCAI, 2021. X. Liu, X. Tian, S. Lin, Y. Qu, L. Ma, W. Yuan, Z. Zhang, and Y. Xie. paper
  15. Prototype memory and attention mechanisms for few shot image generation, in ICLR, 2022. T. Li, Z. Li, A. Luo, H. Rockwell, A. B. Farimani, and T. S. Lee. paper code
  16. Hierarchical variational memory for few-shot learning across domains, in ICLR, 2022. Y. Du, X. Zhen, L. Shao, and C. G. M. Snoek. paper code
  17. Remember the difference: Cross-domain few-shot semantic segmentation via meta-memory transfer, in CVPR, 2022. W. Wang, L. Duan, Y. Wang, Q. En, J. Fan, and Z. Zhang. paper

Generative Modeling

  1. One-shot learning of object categories, TPAMI, 2006. L. Fei-Fei, R. Fergus, and P. Perona. paper
  2. Learning to learn with compound HD models, in NeurIPS, 2011. A. Torralba, J. B. Tenenbaum, and R. R. Salakhutdinov. paper
  3. One-shot learning with a hierarchical nonparametric Bayesian model, in ICML Workshop on Unsupervised and Transfer Learning, 2012. R. Salakhutdinov, J. Tenenbaum, and A. Torralba. paper
  4. Human-level concept learning through probabilistic program induction, Science, 2015. B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum. paper
  5. One-shot generalization in deep generative models, in ICML, 2016. D. Rezende, I. Danihelka, K. Gregor, and D. Wierstra. paper
  6. One-shot video object segmentation, in CVPR, 2017. S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, and L. Van Gool. paper
  7. Towards a neural statistician, in ICLR, 2017. H. Edwards and A. Storkey. paper
  8. Extending a parser to distant domains using a few dozen partially annotated examples, in ACL, 2018. V. Joshi, M. Peters, and M. Hopkins. paper
  9. MetaGAN: An adversarial approach to few-shot learning, in NeurIPS, 2018. R. Zhang, T. Che, Z. Ghahramani, Y. Bengio, and Y. Song. paper
  10. Few-shot autoregressive density estimation: Towards learning to learn distributions, in ICLR, 2018. S. Reed, Y. Chen, T. Paine, A. van den Oord, S. M. A. Eslami, D. Rezende, O. Vinyals, and N. de Freitas. paper
  11. The variational homoencoder: Learning to learn high capacity generative models from few examples, in UAI, 2018. L. B. Hewitt, M. I. Nye, A. Gane, T. Jaakkola, and J. B. Tenenbaum. paper
  12. Meta-learning probabilistic inference for prediction, in ICLR, 2019. J. Gordon, J. Bronskill, M. Bauer, S. Nowozin, and R. Turner. paper
  13. Variational prototyping-encoder: One-shot learning with prototypical images, in CVPR, 2019. J. Kim, T.-H. Oh, S. Lee, F. Pan, and I. S. Kweon. paper code
  14. Variational few-shot learning, in ICCV, 2019. J. Zhang, C. Zhao, B. Ni, M. Xu, and X. Yang. paper
  15. Infinite mixture prototypes for few-shot learning, in ICML, 2019. K. Allen, E. Shelhamer, H. Shin, and J. Tenenbaum. paper
  16. Dual variational generation for low shot heterogeneous face recognition, in NeurIPS, 2019. C. Fu, X. Wu, Y. Hu, H. Huang, and R. He. paper
  17. Bayesian meta sampling for fast uncertainty adaptation, in ICLR, 2020. Z. Wang, Y. Zhao, P. Yu, R. Zhang, and C. Chen. paper
  18. Empirical Bayes transductive meta-learning with synthetic gradients, in ICLR, 2020. S. X. Hu, P. G. Moreno, Y. Xiao, X. Shen, G. Obozinski, N. D. Lawrence, and A. C. Damianou. paper
  19. Few-shot relation extraction via Bayesian meta-learning on relation graphs, in ICML, 2020. M. Qu, T. Gao, L. A. C. Xhonneux, and J. Tang. paper code
  20. Interventional few-shot learning, in NeurIPS, 2020. Z. Yue, H. Zhang, Q. Sun, and X. Hua. paper code
  21. Bayesian few-shot classification with one-vs-each Pólya-Gamma augmented Gaussian processes, in ICLR, 2021. J. Snell and R. Zemel. paper
  22. Few-shot Bayesian optimization with deep kernel surrogates, in ICLR, 2021. M. Wistuba and J. Grabocka. paper
  23. Modeling the probabilistic distribution of unlabeled data for one-shot medical image segmentation, in AAAI, 2021. Y. Ding, X. Yu, and Y. Yang. paper code
  24. A hierarchical transformation-discriminating generative model for few shot anomaly detection, in ICCV, 2021. S. Sheynin, S. Benaim, and L. Wolf. paper
  25. Reinforced few-shot acquisition function learning for Bayesian optimization, in NeurIPS, 2021. B. Hsieh, P. Hsieh, and X. Liu. paper
  26. GanOrCon: Are generative models useful for few-shot segmentation?, in CVPR, 2022. O. Saha, Z. Cheng, and S. Maji. paper
  27. Few shot generative model adaption via relaxed spatial structural alignment, in CVPR, 2022. J. Xiao, L. Li, C. Wang, Z. Zha, and Q. Huang. paper

Algorithm

Refining Existing Parameters

  1. Cross-generalization: Learning novel classes from a single example by feature replacement, in CVPR, 2005. E. Bart and S. Ullman. paper
  2. One-shot adaptation of supervised deep convolutional models, in ICLR, 2013. J. Hoffman, E. Tzeng, J. Donahue, Y. Jia, K. Saenko, and T. Darrell. paper
  3. Learning to learn: Model regression networks for easy small sample learning, in ECCV, 2016. Y.-X. Wang and M. Hebert. paper
  4. Learning from small sample sets by combining unsupervised meta-training with CNNs, in NeurIPS, 2016. Y.-X. Wang and M. Hebert. paper
  5. Efficient k-shot learning with regularized deep networks, in AAAI, 2018. D. Yoo, H. Fan, V. N. Boddeti, and K. M. Kitani. paper
  6. CLEAR: Cumulative learning for one-shot one-class image recognition, in CVPR, 2018. J. Kozerawski and M. Turk. paper
  7. Learning structure and strength of CNN filters for small sample size training, in CVPR, 2018. R. Keshari, M. Vatsa, R. Singh, and A. Noore. paper
  8. Dynamic few-shot visual learning without forgetting, in CVPR, 2018. S. Gidaris and N. Komodakis. paper code
  9. Low-shot learning with imprinted weights, in CVPR, 2018. H. Qi, M. Brown, and D. G. Lowe. paper
  10. Neural voice cloning with a few samples, in NeurIPS, 2018. S. Arik, J. Chen, K. Peng, W. Ping, and Y. Zhou. paper
  11. Text classification with few examples using controlled generalization, in NAACL-HLT, 2019. A. Mahabal, J. Baldridge, B. K. Ayan, V. Perot, and D. Roth. paper
  12. Low shot box correction for weakly supervised object detection, in IJCAI, 2019. T. Pan, B. Wang, G. Ding, J. Han, and J. Yong. paper
  13. Diversity with cooperation: Ensemble methods for few-shot classification, in ICCV, 2019. N. Dvornik, C. Schmid, and J. Mairal. paper
  14. Few-shot image recognition with knowledge transfer, in ICCV, 2019. Z. Peng, Z. Li, J. Zhang, Y. Li, G.-J. Qi, and J. Tang. paper
  15. Generating classification weights with GNN denoising autoencoders for few-shot learning, in CVPR, 2019. S. Gidaris and N. Komodakis. paper code
  16. Dense classification and implanting for few-shot learning, in CVPR, 2019. Y. Lifchitz, Y. Avrithis, S. Picard, and A. Bursuc. paper
  17. Few-shot adaptive faster R-CNN, in CVPR, 2019. T. Wang, X. Zhang, L. Yuan, and J. Feng. paper
  18. TransMatch: A transfer-learning scheme for semi-supervised few-shot learning, in CVPR, 2020. Z. Yu, L. Chen, Z. Cheng, and J. Luo. paper
  19. Learning to select base classes for few-shot classification, in CVPR, 2020. L. Zhou, P. Cui, X. Jia, S. Yang, and Q. Tian. paper
  20. Few-shot NLG with pre-trained language model, in ACL, 2020. Z. Chen, H. Eavani, W. Chen, Y. Liu, and W. Y. Wang. paper code
  21. Span-ConveRT: Few-shot span extraction for dialog with pretrained conversational representations, in ACL, 2020. S. Coope, T. Farghly, D. Gerz, I. Vulic, and M. Henderson. paper
  22. Structural supervision improves few-shot learning and syntactic generalization in neural language models, in EMNLP, 2020. E. Wilcox, P. Qian, R. Futrell, R. Kohita, R. Levy, and M. Ballesteros. paper code
  23. A baseline for few-shot image classification, in ICLR, 2020. G. S. Dhillon, P. Chaudhari, A. Ravichandran, and S. Soatto. paper
  24. Cross-domain few-shot classification via learned feature-wise transformation, in ICLR, 2020. H. Tseng, H. Lee, J. Huang, and M. Yang. paper code
  25. Graph few-shot learning via knowledge transfer, in AAAI, 2020. H. Yao, C. Zhang, Y. Wei, M. Jiang, S. Wang, J. Huang, N. V. Chawla, and Z. Li. paper
  26. Knowledge graph transfer network for few-shot recognition, in AAAI, 2020. R. Chen, T. Chen, X. Hui, H. Wu, G. Li, and L. Lin. paper
  27. Context-Transformer: Tackling object confusion for few-shot detection, in AAAI, 2020. Z. Yang, Y. Wang, X. Chen, J. Liu, and Y. Qiao. paper
  28. A broader study of cross-domain few-shot learning, in ECCV, 2020. Y. Guo, N. C. Codella, L. Karlinsky, J. V. Codella, J. R. Smith, K. Saenko, T. Rosing, and R. Feris. paper code
  29. Selecting relevant features from a multi-domain representation for few-shot classification, in ECCV, 2020. N. Dvornik, C. Schmid, and J. Mairal. paper code
  30. Prototype completion with primitive knowledge for few-shot learning, in CVPR, 2021. B. Zhang, X. Li, Y. Ye, Z. Huang, and L. Zhang. paper code
  31. Partial is better than all: Revisiting fine-tuning strategy for few-shot learning, in AAAI, 2021. Z. Shen, Z. Liu, J. Qin, M. Savvides, and K.-T. Cheng. paper
  32. PTN: A Poisson transfer network for semi-supervised few-shot learning, in AAAI, 2021. H. Huang, J. Zhang, J. Zhang, Q. Wu, and C. Xu. paper
  33. A universal representation transformer layer for few-shot image classification, in ICLR, 2021. L. Liu, W. L. Hamilton, G. Long, J. Jiang, and H. Larochelle. paper
  34. Making pre-trained language models better few-shot learners, in ACL-IJCNLP, 2021. T. Gao, A. Fisch, and D. Chen. paper code
  35. Self-supervised network evolution for few-shot classification, in IJCAI, 2021. X. Tang, Z. Teng, B. Zhang, and J. Fan. paper
  36. Calibrate before use: Improving few-shot performance of language models, in ICML, 2021. Z. Zhao, E. Wallace, S. Feng, D. Klein, and S. Singh. paper code
  37. Language models are few-shot learners, in NeurIPS, 2020. T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei. paper
  38. It’s not just size that matters: Small language models are also few-shot learners, in NAACL-HLT, 2021. T. Schick and H. Schütze. paper code
  39. Self-training improves pre-training for few-shot learning in task-oriented dialog systems, in EMNLP, 2021. F. Mi, W. Zhou, L. Kong, F. Cai, M. Huang, and B. Faltings. paper
  40. Few-shot intent detection via contrastive pre-training and fine-tuning, in EMNLP, 2021. J. Zhang, T. Bui, S. Yoon, X. Chen, Z. Liu, C. Xia, Q. H. Tran, W. Chang, and P. S. Yu. paper code
  41. Avoiding inference heuristics in few-shot prompt-based finetuning, in EMNLP, 2021. P. A. Utama, N. S. Moosavi, V. Sanh, and I. Gurevych. paper code
  42. Constrained language models yield few-shot semantic parsers, in EMNLP, 2021. R. Shin, C. H. Lin, S. Thomson, C. Chen, S. Roy, E. A. Platanios, A. Pauls, D. Klein, J. Eisner, and B. V. Durme. paper code
  43. Revisiting self-training for few-shot learning of language model, in EMNLP, 2021. Y. Chen, Y. Zhang, C. Zhang, G. Lee, R. Cheng, and H. Li. paper code
  44. Language models are few-shot butlers, in EMNLP, 2021. V. Micheli and F. Fleuret. paper code
  45. FewshotQA: A simple framework for few-shot learning of question answering tasks using pre-trained text-to-text models, in EMNLP, 2021. R. Chada and P. Natarajan. paper
  46. TransPrompt: Towards an automatic transferable prompting framework for few-shot text classification, in EMNLP, 2021. C. Wang, J. Wang, M. Qiu, J. Huang, and M. Gao. paper
  47. Meta distant transfer learning for pre-trained language models, in EMNLP, 2021. C. Wang, H. Pan, M. Qiu, J. Huang, F. Yang, and Y. Zhang. paper
  48. STraTA: Self-training with task augmentation for better few-shot learning, in EMNLP, 2021. T. Vu, M. Luong, Q. V. Le, G. Simon, and M. Iyyer. paper code
  49. Few-shot image classification: Just use a library of pre-trained feature extractors and a simple classifier, in ICCV, 2021. A. Chowdhury, M. Jiang, S. Chaudhuri, and C. Jermaine. paper code
  50. On the importance of distractors for few-shot classification, in ICCV, 2021. R. Das, Y. Wang, and J. M. F. Moura. paper code
  51. A multi-mode modulator for multi-domain few-shot classification, in ICCV, 2021. Y. Liu, J. Lee, L. Zhu, L. Chen, H. Shi, and Y. Yang. paper
  52. Universal representation learning from multiple domains for few-shot classification, in ICCV, 2021. W. Li, X. Liu, and H. Bilen. paper code
  53. Boosting the generalization capability in cross-domain few-shot learning via noise-enhanced supervised autoencoder, in ICCV, 2021. H. Liang, Q. Zhang, P. Dai, and J. Lu. paper
  54. How fine-tuning allows for effective meta-learning, in NeurIPS, 2021. K. Chua, Q. Lei, and J. D. Lee. paper
  55. Multimodal few-shot learning with frozen language models, in NeurIPS, 2021. M. Tsimpoukelli, J. Menick, S. Cabi, S. M. A. Eslami, O. Vinyals, and F. Hill. paper
  56. Grad2Task: Improved few-shot text classification using gradients for task representation, in NeurIPS, 2021. J. Wang, K. Wang, F. Rudzicz, and M. Brudno. paper
  57. True few-shot learning with language models, in NeurIPS, 2021. E. Perez, D. Kiela, and K. Cho. paper
  58. POODLE: Improving few-shot learning via penalizing out-of-distribution samples, in NeurIPS, 2021. D. Le, K. Nguyen, Q. Tran, R. Nguyen, and B. Hua. paper
  59. TOHAN: A one-step approach towards few-shot hypothesis adaptation, in NeurIPS, 2021. H. Chi, F. Liu, W. Yang, L. Lan, T. Liu, B. Han, W. Cheung, and J. Kwok. paper
  60. Task affinity with maximum bipartite matching in few-shot learning, in ICLR, 2022. C. P. Le, J. Dong, M. Soltani, and V. Tarokh. paper
  61. Differentiable prompt makes pre-trained language models better few-shot learners, in ICLR, 2022. N. Zhang, L. Li, X. Chen, S. Deng, Z. Bi, C. Tan, F. Huang, and H. Chen. paper code
  62. ConFeSS: A framework for single source cross-domain few-shot learning, in ICLR, 2022. D. Das, S. Yun, and F. Porikli. paper
  63. Switch to generalize: Domain-switch learning for cross-domain few-shot classification, in ICLR, 2022. Z. Hu, Y. Sun, and Y. Yang. paper
  64. LM-BFF-MS: Improving few-shot fine-tuning of language models based on multiple soft demonstration memory, in ACL, 2022. E. Park, D. H. Jeon, S. Kim, I. Kang, and S. Na. paper code
  65. Meta-learning via language model in-context tuning, in ACL, 2022. Y. Chen, R. Zhong, S. Zha, G. Karypis, and H. He. paper code
  66. Few-shot tabular data enrichment using fine-tuned transformer architectures, in ACL, 2022. A. Harari, and G. Katz. paper
  67. Noisy channel language model prompting for few-shot text classification, in ACL, 2022. S. Min, M. Lewis, H. Hajishirzi, and L. Zettlemoyer. paper code
  68. Prompt for extraction? PAIE: Prompting argument interaction for event argument extraction, in ACL, 2022. Y. Ma, Z. Wang, Y. Cao, M. Li, M. Chen, K. Wang, and J. Shao. paper code
  69. Are prompt-based models clueless?, in ACL, 2022. P. Kavumba, R. Takahashi, and Y. Oda. paper
  70. Prototypical verbalizer for prompt-based few-shot tuning, in ACL, 2022. G. Cui, S. Hu, N. Ding, L. Huang, and Z. Liu. paper code
  71. Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity, in ACL, 2022. Y. Lu, M. Bartolo, A. Moore, S. Riedel, and P. Stenetorp. paper
  72. PPT: Pre-trained prompt tuning for few-shot learning, in ACL, 2022. Y. Gu, X. Han, Z. Liu, and M. Huang. paper code
  73. ASCM: An answer space clustered prompting method without answer engineering, in Findings of ACL, 2022. Z. Wang, Y. Yang, Z. Xi, B. Ma, L. Wang, R. Dong, and A. Anwar. paper code
  74. Exploiting language model prompts using similarity measures: A case study on the word-in-context task, in ACL, 2022. M. Tabasi, K. Rezaee, and M. T. Pilehvar. paper
  75. P-Tuning: Prompt tuning can be comparable to fine-tuning across scales and tasks, in ACL, 2022. X. Liu, K. Ji, Y. Fu, W. Tam, Z. Du, Z. Yang, and J. Tang. paper
  76. Cutting down on prompts and parameters: Simple few-shot learning with language models, in Findings of ACL, 2022. R. L. Logan IV, I. Balazevic, E. Wallace, F. Petroni, S. Singh, and S. Riedel. paper code
  77. Prompt-free and efficient few-shot learning with language models, in ACL, 2022. R. K. Mahabadi, L. Zettlemoyer, J. Henderson, L. Mathias, M. Saeidi, V. Stoyanov, and M. Yazdani. paper code
  78. Pre-training to match for unified low-shot relation extraction, in ACL, 2022. F. Liu, H. Lin, X. Han, B. Cao, and L. Sun. paper code
  79. Dual context-guided continuous prompt tuning for few-shot learning, in Findings of ACL, 2022. J. Zhou, L. Tian, H. Yu, Z. Xiao, H. Su, and J. Zhou. paper
  80. Cluster & tune: Boost cold start performance in text classification, in ACL, 2022. E. Shnarch, A. Gera, A. Halfon, L. Dankin, L. Choshen, R. Aharonov, and N. Slonim. paper code
  81. Pushing the limits of simple pipelines for few-shot learning: External data and fine-tuning make a difference, in CVPR, 2022. S. X. Hu, D. Li, J. Stühmer, M. Kim, and T. M. Hospedales. paper code

Refining Meta-learned Parameters

  1. Model-agnostic meta-learning for fast adaptation of deep networks, in ICML, 2017. C. Finn, P. Abbeel, and S. Levine. paper
  2. Bayesian model-agnostic meta-learning, in NeurIPS, 2018. J. Yoon, T. Kim, O. Dia, S. Kim, Y. Bengio, and S. Ahn. paper
  3. Probabilistic model-agnostic meta-learning, in NeurIPS, 2018. C. Finn, K. Xu, and S. Levine. paper
  4. Gradient-based meta-learning with learned layerwise metric and subspace, in ICML, 2018. Y. Lee and S. Choi. paper
  5. Recasting gradient-based meta-learning as hierarchical Bayes, in ICLR, 2018. E. Grant, C. Finn, S. Levine, T. Darrell, and T. Griffiths. paper
  6. Few-shot human motion prediction via meta-learning, in ECCV, 2018. L.-Y. Gui, Y.-X. Wang, D. Ramanan, and J. Moura. paper
  7. The effects of negative adaptation in model-agnostic meta-learning, arXiv preprint, 2018. T. Deleu and Y. Bengio. paper
  8. Unsupervised meta-learning for few-shot image classification, in NeurIPS, 2019. S. Khodadadeh, L. Bölöni, and M. Shah. paper
  9. Amortized Bayesian meta-learning, in ICLR, 2019. S. Ravi and A. Beatson. paper
  10. Meta-learning with latent embedding optimization, in ICLR, 2019. A. A. Rusu, D. Rao, J. Sygnowski, O. Vinyals, R. Pascanu, S. Osindero, and R. Hadsell. paper code
  11. Meta relational learning for few-shot link prediction in knowledge graphs, in EMNLP-IJCNLP, 2019. M. Chen, W. Zhang, W. Zhang, Q. Chen, and H. Chen. paper
  12. Adapting meta knowledge graph information for multi-hop reasoning over few-shot relations, in EMNLP-IJCNLP, 2019. X. Lv, Y. Gu, X. Han, L. Hou, J. Li, and Z. Liu. paper
  13. LGM-Net: Learning to generate matching networks for few-shot learning, in ICML, 2019. H. Li, W. Dong, X. Mei, C. Ma, F. Huang, and B.-G. Hu. paper code
  14. Meta R-CNN: Towards general solver for instance-level low-shot learning, in ICCV, 2019. X. Yan, Z. Chen, A. Xu, X. Wang, X. Liang, and L. Lin. paper
  15. Task agnostic meta-learning for few-shot learning, in CVPR, 2019. M. A. Jamal, and G.-J. Qi. paper
  16. Meta-transfer learning for few-shot learning, in CVPR, 2019. Q. Sun, Y. Liu, T.-S. Chua, and B. Schiele. paper code
  17. Meta-learning of neural architectures for few-shot learning, in CVPR, 2020. T. Elsken, B. Staffler, J. H. Metzen, and F. Hutter. paper
  18. Attentive weights generation for few shot learning via information maximization, in CVPR, 2020. Y. Guo, and N.-M. Cheung. paper
  19. Few-shot open-set recognition using meta-learning, in CVPR, 2020. B. Liu, H. Kang, H. Li, G. Hua, and N. Vasconcelos. paper
  20. Incremental few-shot object detection, in CVPR, 2020. J.-M. Perez-Rua, X. Zhu, T. M. Hospedales, and T. Xiang. paper
  21. Automated relational meta-learning, in ICLR, 2020. H. Yao, X. Wu, Z. Tao, Y. Li, B. Ding, R. Li, and Z. Li. paper
  22. Meta-learning with warped gradient descent, in ICLR, 2020. S. Flennerhag, A. A. Rusu, R. Pascanu, F. Visin, H. Yin, and R. Hadsell. paper
  23. Meta-learning without memorization, in ICLR, 2020. M. Yin, G. Tucker, M. Zhou, S. Levine, and C. Finn. paper
  24. ES-MAML: Simple Hessian-free meta learning, in ICLR, 2020. X. Song, W. Gao, Y. Yang, K. Choromanski, A. Pacchiano, and Y. Tang. paper
  25. Self-supervised tuning for few-shot segmentation, in IJCAI, 2020. K. Zhu, W. Zhai, and Y. Cao. paper
  26. Multi-attention meta learning for few-shot fine-grained image recognition, in IJCAI, 2020. Y. Zhu, C. Liu, and S. Jiang. paper
  27. An ensemble of epoch-wise empirical Bayes for few-shot learning, in ECCV, 2020. Y. Liu, B. Schiele, and Q. Sun. paper code
  28. Incremental few-shot meta-learning via indirect discriminant alignment, in ECCV, 2020. Q. Liu, O. Majumder, A. Achille, A. Ravichandran, R. Bhotika, and S. Soatto. paper
  29. Model-agnostic boundary-adversarial sampling for test-time generalization in few-shot learning, in ECCV, 2020. J. Kim, H. Kim, and G. Kim. paper code
  30. Bayesian meta-learning for the few-shot setting via deep kernels, in NeurIPS, 2020. M. Patacchiola, J. Turner, E. J. Crowley, M. O’Boyle, and A. J. Storkey. paper code
  31. OOD-MAML: Meta-learning for few-shot out-of-distribution detection and classification, in NeurIPS, 2020. T. Jeong, and H. Kim. paper code
  32. Unraveling meta-learning: Understanding feature representations for few-shot tasks, in ICML, 2020. M. Goldblum, S. Reich, L. Fowl, R. Ni, V. Cherepanova, and T. Goldstein. paper code
  33. Node classification on graphs with few-shot novel labels via meta transformed network embedding, in NeurIPS, 2020. L. Lan, P. Wang, X. Du, K. Song, J. Tao, and X. Guan. paper
  34. Adversarially robust few-shot learning: A meta-learning approach, in NeurIPS, 2020. M. Goldblum, L. Fowl, and T. Goldstein. paper code
  35. BOIL: Towards representation change for few-shot learning, in ICLR, 2021. J. Oh, H. Yoo, C. Kim, and S. Yun. paper code
  36. Few-shot open-set recognition by transformation consistency, in CVPR, 2021. M. Jeong, S. Choi, and C. Kim. paper
  37. Improving generalization in meta-learning via task augmentation, in ICML, 2021. H. Yao, L. Huang, L. Zhang, Y. Wei, L. Tian, J. Zou, J. Huang, and Z. Li. paper
  38. A representation learning perspective on the importance of train-validation splitting in meta-learning, in ICML, 2021. N. Saunshi, A. Gupta, and W. Hu. paper code
  39. Data augmentation for meta-learning, in ICML, 2021. R. Ni, M. Goldblum, A. Sharaf, K. Kong, and T. Goldstein. paper code
  40. Task cooperation for semi-supervised few-shot learning, in AAAI, 2021. H. Ye, X. Li, and D.-C. Zhan. paper
  41. Conditional self-supervised learning for few-shot classification, in IJCAI, 2021. Y. An, H. Xue, X. Zhao, and L. Zhang. paper
  42. Cross-domain few-shot classification via adversarial task augmentation, in IJCAI, 2021. H. Wang, and Z.-H. Deng. paper code
  43. DReCa: A general task augmentation strategy for few-shot natural language inference, in NAACL-HLT, 2021. S. Murty, T. Hashimoto, and C. D. Manning. paper
  44. MetaXL: Meta representation transformation for low-resource cross-lingual learning, in NAACL-HLT, 2021. M. Xia, G. Zheng, S. Mukherjee, M. Shokouhi, G. Neubig, and A. H. Awadallah. paper code
  45. Meta-learning with task-adaptive loss function for few-shot learning, in ICCV, 2021. S. Baik, J. Choi, H. Kim, D. Cho, J. Min, and K. M. Lee. paper code
  46. Meta-Baseline: Exploring simple meta-learning for few-shot learning, in ICCV, 2021. Y. Chen, Z. Liu, H. Xu, T. Darrell, and X. Wang. paper
  47. A lazy approach to long-horizon gradient-based meta-learning, in ICCV, 2021. M. A. Jamal, L. Wang, and B. Gong. paper
  48. Task-aware part mining network for few-shot learning, in ICCV, 2021. J. Wu, T. Zhang, Y. Zhang, and F. Wu. paper
  49. Binocular mutual learning for improving few-shot classification, in ICCV, 2021. Z. Zhou, X. Qiu, J. Xie, J. Wu, and C. Zhang. paper code
  50. Meta-learning with an adaptive task scheduler, in NeurIPS, 2021. H. Yao, Y. Wang, Y. Wei, P. Zhao, M. Mahdavi, D. Lian, and C. Finn. paper
  51. Memory efficient meta-learning with large images, in NeurIPS, 2021. J. Bronskill, D. Massiceti, M. Patacchiola, K. Hofmann, S. Nowozin, and R. Turner. paper
  52. EvoGrad: Efficient gradient-based meta-learning and hyperparameter optimization, in NeurIPS, 2021. O. Bohdal, Y. Yang, and T. Hospedales. paper
  53. Towards enabling meta-learning from target models, in NeurIPS, 2021. S. Lu, H. Ye, L. Gan, and D. Zhan. paper
  54. The role of global labels in few-shot classification and how to infer them, in NeurIPS, 2021. R. Wang, M. Pontil, and C. Ciliberto. paper
  55. How to train your MAML to excel in few-shot classification, in ICLR, 2022. H. Ye, and W. Chao. paper code
  56. Meta-learning with fewer tasks through task interpolation, in ICLR, 2022. H. Yao, L. Zhang, and C. Finn. paper code
  57. Continuous-time meta-learning with forward mode differentiation, in ICLR, 2022. T. Deleu, D. Kanaa, L. Feng, G. Kerg, Y. Bengio, G. Lajoie, and P. Bacon. paper
  58. Bootstrapped meta-learning, in ICLR, 2022. S. Flennerhag, Y. Schroecker, T. Zahavy, H. van Hasselt, D. Silver, and S. Singh. paper
  59. Learning prototype-oriented set representations for meta-learning, in ICLR, 2022. D. Guo, L. Tian, M. Zhang, M. Zhou, and H. Zha. paper
  60. Dynamic kernel selection for improved generalization and memory efficiency in meta-learning, in CVPR, 2022. A. Chavan, R. Tiwari, U. Bamba, and D. K. Gupta. paper code
  61. What matters for meta-learning vision regression tasks?, in CVPR, 2022. N. Gao, H. Ziesche, N. A. Vien, M. Volpp, and G. Neumann. paper code
  62. Multidimensional belief quantification for label-efficient meta-learning, in CVPR, 2022. D. S. Pandey, and Q. Yu. paper

Learning Search Steps

  1. Optimization as a model for few-shot learning, in ICLR, 2017. S. Ravi and H. Larochelle. paper code
  2. Meta Navigator: Search for a good adaptation policy for few-shot learning, in ICCV, 2021. C. Zhang, H. Ding, G. Lin, R. Li, C. Wang, and C. Shen. paper

Applications

Computer Vision

  1. Learning robust visual-semantic embeddings, in CVPR, 2017. Y.-H. Tsai, L.-K. Huang, and R. Salakhutdinov. paper
  2. One-shot action localization by learning sequence matching network, in CVPR, 2018. H. Yang, X. He, and F. Porikli. paper
  3. Incremental few-shot learning for pedestrian attribute recognition, in IJCAI, 2019. L. Xiang, X. Jin, G. Ding, J. Han, and L. Li. paper
  4. Few-shot video-to-video synthesis, in NeurIPS, 2019. T.-C. Wang, M.-Y. Liu, A. Tao, G. Liu, J. Kautz, and B. Catanzaro. paper code
  5. Few-shot object detection via feature reweighting, in ICCV, 2019. B. Kang, Z. Liu, X. Wang, F. Yu, J. Feng, and T. Darrell. paper code
  6. Few-shot unsupervised image-to-image translation, in ICCV, 2019. M.-Y. Liu, X. Huang, A. Mallya, T. Karras, T. Aila, J. Lehtinen, and J. Kautz. paper code
  7. Feature weighting and boosting for few-shot segmentation, in ICCV, 2019. K. Nguyen, and S. Todorovic. paper
  8. Few-shot adaptive gaze estimation, in ICCV, 2019. S. Park, S. De Mello, P. Molchanov, U. Iqbal, O. Hilliges, and J. Kautz. paper
  9. AMP: Adaptive masked proxies for few-shot segmentation, in ICCV, 2019. M. Siam, B. N. Oreshkin, and M. Jagersand. paper code
  10. Few-shot generalization for single-image 3D reconstruction via priors, in ICCV, 2019. B. Wallace, and B. Hariharan. paper
  11. Few-shot adversarial learning of realistic neural talking head models, in ICCV, 2019. E. Zakharov, A. Shysheya, E. Burkov, and V. Lempitsky. paper code
  12. Pyramid graph networks with connection attentions for region-based one-shot semantic segmentation, in ICCV, 2019. C. Zhang, G. Lin, F. Liu, J. Guo, Q. Wu, and R. Yao. paper
  13. Time-conditioned action anticipation in one shot, in CVPR, 2019. Q. Ke, M. Fritz, and B. Schiele. paper
  14. Few-shot learning with localization in realistic settings, in CVPR, 2019. D. Wertheimer, and B. Hariharan. paper code
  15. Improving few-shot user-specific gaze adaptation via gaze redirection synthesis, in CVPR, 2019. Y. Yu, G. Liu, and J.-M. Odobez. paper
  16. CANet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning, in CVPR, 2019. C. Zhang, G. Lin, F. Liu, R. Yao, and C. Shen. paper code
  17. Multi-level semantic feature augmentation for one-shot learning, in TIP, 2019. Z. Chen, Y. Fu, Y. Zhang, Y.-G. Jiang, X. Xue, and L. Sigal. paper code
  18. Few-shot pill recognition, in CVPR, 2020. S. Ling, A. Pastor, J. Li, Z. Che, J. Wang, J. Kim, and P. Le Callet. paper
  19. LT-Net: Label transfer by learning reversible voxel-wise correspondence for one-shot medical image segmentation, in CVPR, 2020. S. Wang, S. Cao, D. Wei, R. Wang, K. Ma, L. Wang, D. Meng, and Y. Zheng. paper
  20. 3FabRec: Fast few-shot face alignment by reconstruction, in CVPR, 2020. B. Browatzki, and C. Wallraven. paper
  21. Few-shot video classification via temporal alignment, in CVPR, 2020. K. Cao, J. Ji, Z. Cao, C.-Y. Chang, and J. C. Niebles. paper
  22. One-shot adversarial attacks on visual tracking with dual attention, in CVPR, 2020. X. Chen, X. Yan, F. Zheng, Y. Jiang, S.-T. Xia, Y. Zhao, and R. Ji. paper
  23. FGN: Fully guided network for few-shot instance segmentation, in CVPR, 2020. Z. Fan, J.-G. Yu, Z. Liang, J. Ou, C. Gao, G.-S. Xia, and Y. Li. paper
  24. CRNet: Cross-reference networks for few-shot segmentation, in CVPR, 2020. W. Liu, C. Zhang, G. Lin, and F. Liu. paper
  25. Revisiting pose-normalization for fine-grained few-shot recognition, in CVPR, 2020. L. Tang, D. Wertheimer, and B. Hariharan. paper
  26. Few-shot learning of part-specific probability space for 3D shape segmentation, in CVPR, 2020. L. Wang, X. Li, and Y. Fang. paper
  27. Semi-supervised learning for few-shot image-to-image translation, in CVPR, 2020. Y. Wang, S. Khan, A. Gonzalez-Garcia, J. van de Weijer, and F. S. Khan. paper
  28. Multi-domain learning for accurate and few-shot color constancy, in CVPR, 2020. J. Xiao, S. Gu, and L. Zhang. paper
  29. One-shot domain adaptation for face generation, in CVPR, 2020. C. Yang, and S.-N. Lim. paper
  30. MetaPix: Few-shot video retargeting, in ICLR, 2020. J. Lee, D. Ramanan, and R. Girdhar. paper
  31. Few-shot human motion prediction via learning novel motion dynamics, in IJCAI, 2020. C. Zang, M. Pei, and Y. Kong. paper
  32. Shaping visual representations with language for few-shot classification, in ACL, 2020. J. Mu, P. Liang, and N. D. Goodman. paper
  33. MarioNETte: Few-shot face reenactment preserving identity of unseen targets, in AAAI, 2020. S. Ha, M. Kersner, B. Kim, S. Seo, and D. Kim. paper
  34. One-shot learning for long-tail visual relation detection, in AAAI, 2020. W. Wang, M. Wang, S. Wang, G. Long, L. Yao, G. Qi, and Y. Chen. paper code
  35. Differentiable meta-learning model for few-shot semantic segmentation, in AAAI, 2020. P. Tian, Z. Wu, L. Qi, L. Wang, Y. Shi, and Y. Gao. paper
  36. Part-aware prototype network for few-shot semantic segmentation, in ECCV, 2020. Y. Liu, X. Zhang, S. Zhang, and X. He. paper code
  37. Prototype mixture models for few-shot semantic segmentation, in ECCV, 2020. B. Yang, C. Liu, B. Li, J. Jiao, and Q. Ye. paper code
  38. Self-supervision with superpixels: Training few-shot medical image segmentation without annotation, in ECCV, 2020. C. Ouyang, C. Biffi, C. Chen, T. Kart, H. Qiu, and D. Rueckert. paper code
  39. Few-shot action recognition with permutation-invariant attention, in ECCV, 2020. H. Zhang, L. Zhang, X. Qi, H. Li, P. H. S. Torr, and P. Koniusz. paper
  40. Few-shot compositional font generation with dual memory, in ECCV, 2020. J. Cha, S. Chun, G. Lee, B. Lee, S. Kim, and H. Lee. paper code
  41. Few-shot object detection and viewpoint estimation for objects in the wild, in ECCV, 2020. Y. Xiao, and R. Marlet. paper
  42. Few-shot scene-adaptive anomaly detection, in ECCV, 2020. Y. Lu, F. Yu, M. K. K. Reddy, and Y. Wang. paper code
  43. Few-shot semantic segmentation with democratic attention networks, in ECCV, 2020. H. Wang, X. Zhang, Y. Hu, Y. Yang, X. Cao, and X. Zhen. paper
  44. Few-shot single-view 3-D object reconstruction with compositional priors, in ECCV, 2020. M. Michalkiewicz, S. Parisot, S. Tsogkas, M. Baktashmotlagh, A. Eriksson, and E. Belilovsky. paper
  45. COCO-FUNIT: Few-shot unsupervised image translation with a content conditioned style encoder, in ECCV, 2020. K. Saito, K. Saenko, and M. Liu. paper code
  46. Deep complementary joint model for complex scene registration and few-shot segmentation on medical images, in ECCV, 2020. Y. He, T. Li, G. Yang, Y. Kong, Y. Chen, H. Shu, J. Coatrieux, J. Dillenseger, and S. Li. paper
  47. Multi-scale positive sample refinement for few-shot object detection, in ECCV, 2020. J. Wu, S. Liu, D. Huang, and Y. Wang. paper code
  48. Large-scale few-shot learning via multi-modal knowledge discovery, in ECCV, 2020. S. Wang, J. Yue, J. Liu, Q. Tian, and M. Wang. paper
  49. Graph convolutional networks for learning with few clean and many noisy labels, in ECCV, 2020. A. Iscen, G. Tolias, Y. Avrithis, O. Chum, and C. Schmid. paper
  50. Self-supervised few-shot learning on point clouds, in NeurIPS, 2020. C. Sharma, and M. Kaul. paper code
  51. Restoring negative information in few-shot object detection, in NeurIPS, 2020. Y. Yang, F. Wei, M. Shi, and G. Li. paper code
  52. Few-shot image generation with elastic weight consolidation, in NeurIPS, 2020. Y. Li, R. Zhang, J. Lu, and E. Shechtman. paper
  53. Few-shot visual reasoning with meta-analogical contrastive learning, in NeurIPS, 2020. Y. Kim, J. Shin, E. Yang, and S. J. Hwang. paper
  54. CrossTransformers: spatially-aware few-shot transfer, in NeurIPS, 2020. C. Doersch, A. Gupta, and A. Zisserman. paper
  55. Make one-shot video object segmentation efficient again, in NeurIPS, 2020. T. Meinhardt, and L. Leal-Taixé. paper code
  56. Frustratingly simple few-shot object detection, in ICML, 2020. X. Wang, T. E. Huang, J. Gonzalez, T. Darrell, and F. Yu. paper code
  57. Adversarial style mining for one-shot unsupervised domain adaptation, in NeurIPS, 2020. Y. Luo, P. Liu, T. Guan, J. Yu, and Y. Yang. paper code
  58. Disentangling 3D prototypical networks for few-shot concept learning, in ICLR, 2021. M. Prabhudesai, S. Lal, D. Patil, H. Tung, A. W. Harley, and K. Fragkiadaki. paper
  59. Learning normal dynamics in videos with meta prototype network, in CVPR, 2021. H. Lv, C. Chen, Z. Cui, C. Xu, Y. Li, and J. Yang. paper code
  60. Learning dynamic alignment via meta-filter for few-shot learning, in CVPR, 2021. C. Xu, Y. Fu, C. Liu, C. Wang, J. Li, F. Huang, L. Zhang, and X. Xue. paper
  61. Delving deep into many-to-many attention for few-shot video object segmentation, in CVPR, 2021. H. Chen, H. Wu, N. Zhao, S. Ren, and S. He. paper code
  62. Adaptive prototype learning and allocation for few-shot segmentation, in CVPR, 2021. G. Li, V. Jampani, L. Sevilla-Lara, D. Sun, J. Kim, and J. Kim. paper code
  63. FAPIS: A few-shot anchor-free part-based instance segmenter, in CVPR, 2021. K. Nguyen, and S. Todorovic. paper
  64. FSCE: Few-shot object detection via contrastive proposal encoding, in CVPR, 2021. B. Sun, B. Li, S. Cai, Y. Yuan, and C. Zhang. paper code
  65. Few-shot 3D point cloud semantic segmentation, in CVPR, 2021. N. Zhao, T. Chua, and G. H. Lee. paper code
  66. Generalized few-shot object detection without forgetting, in CVPR, 2021. Z. Fan, Y. Ma, Z. Li, and J. Sun. paper
  67. Few-shot human motion transfer by personalized geometry and texture modeling, in CVPR, 2021. Z. Huang, X. Han, J. Xu, and T. Zhang. paper code
  68. Labeled from unlabeled: Exploiting unlabeled data for few-shot deep HDR deghosting, in CVPR, 2021. K. R. Prabhakar, G. Senthil, S. Agrawal, R. V. Babu, and R. K. S. S. Gorthi. paper
  69. Few-shot transformation of common actions into time and space, in CVPR, 2021. P. Yang, P. Mettes, and C. G. M. Snoek. paper code
  70. Temporal-relational CrossTransformers for few-shot action recognition, in CVPR, 2021. T. Perrett, A. Masullo, T. Burghardt, M. Mirmehdi, and D. Damen. paper
  71. pixelNeRF: Neural radiance fields from one or few images, in CVPR, 2021. A. Yu, V. Ye, M. Tancik, and A. Kanazawa. paper code
  72. Hallucination improves few-shot object detection, in CVPR, 2021. W. Zhang, and Y. Wang. paper
  73. Few-shot object detection via classification refinement and distractor retreatment, in CVPR, 2021. Y. Li, H. Zhu, Y. Cheng, W. Wang, C. S. Teo, C. Xiang, P. Vadakkepat, and T. H. Lee. paper
  74. Dense relation distillation with context-aware aggregation for few-shot object detection, in CVPR, 2021. H. Hu, S. Bai, A. Li, J. Cui, and L. Wang. paper code
  75. Few-shot segmentation without meta-learning: A good transductive inference is all you need?, in CVPR, 2021. M. Boudiaf, H. Kervadec, Z. I. Masud, P. Piantanida, I. B. Ayed, and J. Dolz. paper code
  76. Few-shot image generation via cross-domain correspondence, in CVPR, 2021. U. Ojha, Y. Li, J. Lu, A. A. Efros, Y. J. Lee, E. Shechtman, and R. Zhang. paper
  77. Self-guided and cross-guided learning for few-shot segmentation, in CVPR, 2021. B. Zhang, J. Xiao, and T. Qin. paper code
  78. Anti-aliasing semantic reconstruction for few-shot semantic segmentation, in CVPR, 2021. B. Liu, Y. Ding, J. Jiao, X. Ji, and Q. Ye. paper
  79. Beyond max-margin: Class margin equilibrium for few-shot object detection, in CVPR, 2021. B. Li, B. Yang, C. Liu, F. Liu, R. Ji, and Q. Ye. paper code
  80. Incremental few-shot instance segmentation, in CVPR, 2021. D. A. Ganea, B. Boom, and R. Poppe. paper code
  81. Scale-aware graph neural network for few-shot semantic segmentation, in CVPR, 2021. G. Xie, J. Liu, H. Xiong, and L. Shao. paper
  82. Semantic relation reasoning for shot-stable few-shot object detection, in CVPR, 2021. C. Zhu, F. Chen, U. Ahmed, Z. Shen, and M. Savvides. paper
  83. Accurate few-shot object detection with support-query mutual guidance and hybrid loss, in CVPR, 2021. L. Zhang, S. Zhou, J. Guan, and J. Zhang. paper
  84. Transformation invariant few-shot object detection, in CVPR, 2021. A. Li, and Z. Li. paper
  85. MetaHTR: Towards writer-adaptive handwritten text recognition, in CVPR, 2021. A. K. Bhunia, S. Ghose, A. Kumar, P. N. Chowdhury, A. Sain, and Y. Song. paper
  86. What if we only use real datasets for scene text recognition? Toward scene text recognition with fewer labels, in CVPR, 2021. J. Baek, Y. Matsui, and K. Aizawa. paper code
  87. Few-shot font generation with localized style representations and factorization, in AAAI, 2021. S. Park, S. Chun, J. Cha, B. Lee, and H. Shim. paper code
  88. Attributes-guided and pure-visual attention alignment for few-shot recognition, in AAAI, 2021. S. Huang, M. Zhang, Y. Kang, and D. Wang. paper code
  89. One-shot face reenactment using appearance adaptive normalization, in AAAI, 2021. G. Yao, Y. Yuan, T. Shao, S. Li, S. Liu, Y. Liu, M. Wang, and K. Zhou. paper
  90. FL-MSRE: A few-shot learning based approach to multimodal social relation extraction, in AAAI, 2021. H. Wan, M. Zhang, J. Du, Z. Huang, Y. Yang, and J. Z. Pan. paper code
  91. StarNet: Towards weakly supervised few-shot object detection, in AAAI, 2021. L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. Bronstein, and R. Giryes. paper code
  92. Progressive one-shot human parsing, in AAAI, 2021. H. He, J. Zhang, B. Thuraisingham, and D. Tao. paper code
  93. Knowledge is power: Hierarchical-knowledge embedded meta-learning for visual reasoning in artistic domains, in KDD, 2021. W. Zheng, L. Yan, C. Gou, and F.-Y. Wang. paper
  94. MEDA: Meta-learning with data augmentation for few-shot text classification, in IJCAI, 2021. P. Sun, Y. Ouyang, W. Zhang, and X.-Y. Dai. paper
  95. Learning implicit temporal alignment for few-shot video classification, in IJCAI, 2021. S. Zhang, J. Zhou, and X. He. paper code
  96. Few-shot neural human performance rendering from sparse RGBD videos, in IJCAI, 2021. A. Pang, X. Chen, H. Luo, M. Wu, J. Yu, and L. Xu. paper
  97. Uncertainty-aware few-shot image classification, in IJCAI, 2021. Z. Zhang, C. Lan, W. Zeng, Z. Chen, and S. Chan. paper
  98. Few-shot learning with part discovery and augmentation from unlabeled images, in IJCAI, 2021. W. Chen, C. Si, W. Wang, L. Wang, Z. Wang, and T. Tan. paper
  99. Few-shot partial-label learning, in IJCAI, 2021. Y. Zhao, G. Yu, L. Liu, Z. Yan, L. Cui, and C. Domeniconi. paper
  100. One-shot affordance detection, in IJCAI, 2021. H. Luo, W. Zhai, J. Zhang, Y. Cao, and D. Tao. paper
  101. DeFRCN: Decoupled faster R-CNN for few-shot object detection, in ICCV, 2021. L. Qiao, Y. Zhao, Z. Li, X. Qiu, J. Wu, and C. Zhang. paper
  102. Learning meta-class memory for few-shot semantic segmentation, in ICCV, 2021. Z. Wu, X. Shi, G. Lin, and J. Cai. paper
  103. UVStyle-Net: Unsupervised few-shot learning of 3D style similarity measure for B-Reps, in ICCV, 2021. P. Meltzer, H. Shayani, A. Khasahmadi, P. K. Jayaraman, A. Sanghi, and J. Lambourne. paper
  104. LoFGAN: Fusing local representations for few-shot image generation, in ICCV, 2021. Z. Gu, W. Li, J. Huo, L. Wang, and Y. Gao. paper
  105. Recurrent mask refinement for few-shot medical image segmentation, in ICCV, 2021. H. Tang, X. Liu, S. Sun, X. Yan, and X. Xie. paper code
  106. H3D-Net: Few-shot high-fidelity 3D head reconstruction, in ICCV, 2021. E. Ramon, G. Triginer, J. Escur, A. Pumarola, J. Garcia, X. Giró-i-Nieto, and F. Moreno-Noguer. paper
  107. Learned spatial representations for few-shot talking-head synthesis, in ICCV, 2021. M. Meshry, S. Suri, L. S. Davis, and A. Shrivastava. paper
  108. Putting NeRF on a diet: Semantically consistent few-shot view synthesis, in ICCV, 2021. A. Jain, M. Tancik, and P. Abbeel. paper
  109. Hypercorrelation squeeze for few-shot segmentation, in ICCV, 2021. J. Min, D. Kang, and M. Cho. paper code
  110. Few-shot semantic segmentation with cyclic memory network, in ICCV, 2021. G. Xie, H. Xiong, J. Liu, Y. Yao, and L. Shao. paper
  111. Simpler is better: Few-shot semantic segmentation with classifier weight transformer, in ICCV, 2021. Z. Lu, S. He, X. Zhu, L. Zhang, Y. Song, and T. Xiang. paper code
  112. Unsupervised few-shot action recognition via action-appearance aligned meta-adaptation, in ICCV, 2021. J. Patravali, G. Mittal, Y. Yu, F. Li, and M. Chen. paper
  113. Multiple heads are better than one: few-shot font generation with multiple localized experts, in ICCV, 2021. S. Park, S. Chun, J. Cha, B. Lee, and H. Shim. paper code
  114. Mining latent classes for few-shot segmentation, in ICCV, 2021. L. Yang, W. Zhuo, L. Qi, Y. Shi, and Y. Gao. paper code
  115. Partner-assisted learning for few-shot image classification, in ICCV, 2021. J. Ma, H. Xie, G. Han, S. Chang, A. Galstyan, and W. Abd-Almageed. paper
  116. Hierarchical graph attention network for few-shot visual-semantic learning, in ICCV, 2021. C. Yin, K. Wu, Z. Che, B. Jiang, Z. Xu, and J. Tang. paper
  117. Video pose distillation for few-shot, fine-grained sports action recognition, in ICCV, 2021. J. Hong, M. Fisher, M. Gharbi, and K. Fatahalian. paper
  118. Universal-prototype enhancing for few-shot object detection, in ICCV, 2021. A. Wu, Y. Han, L. Zhu, and Y. Yang. paper code
  119. Query adaptive few-shot object detection with heterogeneous graph convolutional networks, in ICCV, 2021. G. Han, Y. He, S. Huang, J. Ma, and S. Chang. paper
  120. Few-shot visual relationship co-localization, in ICCV, 2021. R. Teotia, V. Mishra, M. Maheshwari, and A. Mishra. paper code
  121. Shallow Bayesian meta learning for real-world few-shot recognition, in ICCV, 2021. X. Zhang, D. Meng, H. Gouk, and T. M. Hospedales. paper code
  122. Super-resolving cross-domain face miniatures by peeking at one-shot exemplar, in ICCV, 2021. P. Li, X. Yu, and Y. Yang. paper
  123. Few-shot segmentation via cycle-consistent transformer, in NeurIPS, 2021. G. Zhang, G. Kang, Y. Yang, and Y. Wei. paper
  124. Generalized and discriminative few-shot object detection via SVD-dictionary enhancement, in NeurIPS, 2021. A. Wu, S. Zhao, C. Deng, and W. Liu. paper
  125. Re-ranking for image retrieval and transductive few-shot classification, in NeurIPS, 2021. X. Shen, Y. Xiao, S. Hu, O. Sbai, and M. Aubry. paper
  126. Neural view synthesis and matching for semi-supervised few-shot learning of 3D pose, in NeurIPS, 2021. A. Wang, S. Mei, A. L. Yuille, and A. Kortylewski. paper
  127. MetaAvatar: Learning animatable clothed human models from few depth images, in NeurIPS, 2021. S. Wang, M. Mihajlovic, Q. Ma, A. Geiger, and S. Tang. paper
  128. Few-shot object detection via association and discrimination, in NeurIPS, 2021. Y. Cao, J. Wang, Y. Jin, T. Wu, K. Chen, Z. Liu, and D. Lin. paper
  129. Rectifying the shortcut learning of background for few-shot learning, in NeurIPS, 2021. X. Luo, L. Wei, L. Wen, J. Yang, L. Xie, Z. Xu, and Q. Tian. paper
  130. D2C: Diffusion-decoding models for few-shot conditional generation, in NeurIPS, 2021. A. Sinha, J. Song, C. Meng, and S. Ermon. paper
  131. Few-shot backdoor attacks on visual object tracking, in ICLR, 2022. Y. Li, H. Zhong, X. Ma, Y. Jiang, and S. Xia. paper code
  132. Temporal alignment prediction for supervised representation learning and few-shot sequence classification, in ICLR, 2022. B. Su, and J. Wen. paper code
  133. Learning non-target knowledge for few-shot semantic segmentation, in CVPR, 2022. Y. Liu, N. Liu, Q. Cao, X. Yao, J. Han, and L. Shao. paper
  134. Learning what not to segment: A new perspective on few-shot segmentation, in CVPR, 2022. C. Lang, G. Cheng, B. Tu, and J. Han. paper code
  135. Few-shot keypoint detection with uncertainty learning for unseen species, in CVPR, 2022. C. Lu, and P. Koniusz. paper
  136. XMP-Font: Self-supervised cross-modality pre-training for few-shot font generation, in CVPR, 2022. W. Liu, F. Liu, F. Ding, Q. He, and Z. Yi. paper
  137. Spatio-temporal relation modeling for few-shot action recognition, in CVPR, 2022. A. Thatipelli, S. Narayan, S. Khan, R. M. Anwer, F. S. Khan, and B. Ghanem. paper code
  138. Attribute group editing for reliable few-shot image generation, in CVPR, 2022. G. Ding, X. Han, S. Wang, S. Wu, X. Jin, D. Tu, and Q. Huang. paper code
  139. Few-shot backdoor defense using Shapley estimation, in CVPR, 2022. J. Guan, Z. Tu, R. He, and D. Tao. paper
  140. Hybrid relation guided set matching for few-shot action recognition, in CVPR, 2022. X. Wang, S. Zhang, Z. Qing, M. Tang, Z. Zuo, C. Gao, R. Jin, and N. Sang. paper code
  141. Label, verify, correct: A simple few shot object detection method, in CVPR, 2022. P. Kaul, W. Xie, and A. Zisserman. paper
  142. InfoNeRF: Ray entropy minimization for few-shot neural volume rendering, in CVPR, 2022. M. Kim, S. Seo, and B. Han. paper
  143. A closer look at few-shot image generation, in CVPR, 2022. Y. Zhao, H. Ding, H. Huang, and N. Cheung. paper code
  144. Motion-modulated temporal fragment alignment network for few-shot action recognition, in CVPR, 2022. J. Wu, T. Zhang, Z. Zhang, F. Wu, and Y. Zhang. paper
  145. Kernelized few-shot object detection with efficient integral aggregation, in CVPR, 2022. S. Zhang, L. Wang, N. Murray, and P. Koniusz. paper code
  146. FS6D: Few-shot 6D pose estimation of novel objects, in CVPR, 2022. Y. He, Y. Wang, H. Fan, J. Sun, and Q. Chen. paper
  147. Look closer to supervise better: One-shot font generation via component-based discriminator, in CVPR, 2022. Y. Kong, C. Luo, W. Ma, Q. Zhu, S. Zhu, N. Yuan, and L. Jin. paper
  148. Generalized few-shot semantic segmentation, in CVPR, 2022. Z. Tian, X. Lai, L. Jiang, S. Liu, M. Shu, H. Zhao, and J. Jia. paper code
  149. Which images to label for few-shot medical landmark detection?, in CVPR, 2022. Q. Quan, Q. Yao, J. Li, and S. K. Zhou. paper
  150. Dynamic prototype convolution network for few-shot semantic segmentation, in CVPR, 2022. J. Liu, Y. Bao, G. Xie, H. Xiong, J. Sonke, and E. Gavves. paper
  151. OSOP: A multi-stage one shot object pose estimation framework, in CVPR, 2022. I. Shugurov, F. Li, B. Busam, and S. Ilic. paper
  152. Semantic-aligned fusion transformer for one-shot object detection, in CVPR, 2022. Y. Zhao, X. Guo, and Y. Lu. paper
  153. OnePose: One-shot object pose estimation without CAD models, in CVPR, 2022. J. Sun, Z. Wang, S. Zhang, X. He, H. Zhao, G. Zhang, and X. Zhou. paper code
  154. Few-shot object detection with fully cross-transformer, in CVPR, 2022. G. Han, J. Ma, S. Huang, L. Chen, and S. Chang. paper
  155. Learning to memorize feature hallucination for one-shot image generation, in CVPR, 2022. Y. Xie, Y. Fu, Y. Tai, Y. Cao, J. Zhu, and C. Wang. paper
  156. Few-shot font generation by learning fine-grained local styles, in CVPR, 2022. L. Tang, Y. Cai, J. Liu, Z. Hong, M. Gong, M. Fan, J. Han, J. Liu, E. Ding, and J. Wang. paper
  157. Balanced and hierarchical relation learning for one-shot object detection, in CVPR, 2022. H. Yang, S. Cai, H. Sheng, B. Deng, J. Huang, X. Hua, Y. Tang, and Y. Zhang. paper
  158. Few-shot head swapping in the wild, in CVPR, 2022. C. Shu, H. Wu, H. Zhou, J. Liu, Z. Hong, C. Ding, J. Han, J. Liu, E. Ding, and J. Wang. paper
  159. Integrative few-shot learning for classification and segmentation, in CVPR, 2022. D. Kang, and M. Cho. paper
  160. Attribute surrogates learning and spectral tokens pooling in transformers for few-shot learning, in CVPR, 2022. Y. He, W. Liang, D. Zhao, H. Zhou, W. Ge, Y. Yu, and W. Zhang. paper code
  161. Task discrepancy maximization for fine-grained few-shot classification, in CVPR, 2022. S. Lee, W. Moon, and J. Heo. paper

Robotics

  1. Towards one shot learning by imitation for humanoid robots, in ICRA, 2010. Y. Wu and Y. Demiris. paper
  2. Learning manipulation actions from a few demonstrations, in ICRA, 2013. N. Abdo, H. Kretzschmar, L. Spinello, and C. Stachniss. paper
  3. Learning assistive strategies from a few user-robot interactions: Model-based reinforcement learning approach, in ICRA, 2016. M. Hamaya, T. Matsubara, T. Noda, T. Teramae, and J. Morimoto. paper
  4. One-shot imitation learning, in NeurIPS, 2017. Y. Duan, M. Andrychowicz, B. Stadie, J. Ho, J. Schneider, I. Sutskever, P. Abbeel, and W. Zaremba. paper
  5. Meta-learning language-guided policy learning, in ICLR, 2019. J. D. Co-Reyes, A. Gupta, S. Sanjeev, N. Altieri, J. DeNero, P. Abbeel, and S. Levine. paper
  6. Meta reinforcement learning with autonomous inference of subtask dependencies, in ICLR, 2020. S. Sohn, H. Woo, J. Choi, and H. Lee. paper
  7. Watch, try, learn: Meta-learning from demonstrations and rewards, in ICLR, 2020. A. Zhou, E. Jang, D. Kappler, A. Herzog, M. Khansari, P. Wohlhart, Y. Bai, M. Kalakrishnan, S. Levine, and C. Finn. paper
  8. Few-shot Bayesian imitation learning with logical program policies, in AAAI, 2020. T. Silver, K. R. Allen, A. K. Lew, L. P. Kaelbling, and J. Tenenbaum. paper
  9. One solution is not all you need: Few-shot extrapolation via structured MaxEnt RL, in NeurIPS, 2020. S. Kumar, A. Kumar, S. Levine, and C. Finn. paper
  10. Bowtie networks: Generative modeling for joint few-shot recognition and novel-view synthesis, in ICLR, 2021. Z. Bao, Y. Wang, and M. Hebert. paper
  11. Demonstration-conditioned reinforcement learning for few-shot imitation, in ICML, 2021. C. R. Dance, J. Perez, and T. Cachet. paper
  12. Hierarchical few-shot imitation with skill transition models, in ICLR, 2022. K. Hakhamaneshi, R. Zhao, A. Zhan, P. Abbeel, and M. Laskin. paper

Natural Language Processing

  1. High-risk learning: Acquiring new word vectors from tiny data, in EMNLP, 2017. A. Herbelot and M. Baroni. paper
  2. MetaEXP: Interactive explanation and exploration of large knowledge graphs, in TheWebConf, 2018. F. Behrens, S. Bischoff, P. Ladenburger, J. Rückin, L. Seidel, F. Stolp, M. Vaichenker, A. Ziegler, D. Mottin, F. Aghaei, E. Müller, M. Preusse, N. Müller, and M. Hunger. paper code
  3. Few-shot representation learning for out-of-vocabulary words, in ACL, 2019. Z. Hu, T. Chen, K.-W. Chang, and Y. Sun. paper
  4. Learning to customize model structures for few-shot dialogue generation tasks, in ACL, 2020. Y. Song, Z. Liu, W. Bi, R. Yan, and M. Zhang. paper
  5. Few-shot slot tagging with collapsed dependency transfer and label-enhanced task-adaptive projection network, in ACL, 2020. Y. Hou, W. Che, Y. Lai, Z. Zhou, Y. Liu, H. Liu, and T. Liu. paper
  6. Meta-reinforced multi-domain state generator for dialogue systems, in ACL, 2020. Y. Huang, J. Feng, M. Hu, X. Wu, X. Du, and S. Ma. paper
  7. Few-shot knowledge graph completion, in AAAI, 2020. C. Zhang, H. Yao, C. Huang, M. Jiang, Z. Li, and N. V. Chawla. paper
  8. Universal natural language processing with limited annotations: Try few-shot textual entailment as a start, in EMNLP, 2020. W. Yin, N. F. Rajani, D. Radev, R. Socher, and C. Xiong. paper code
  9. Simple and effective few-shot named entity recognition with structured nearest neighbor learning, in EMNLP, 2020. Y. Yang, and A. Katiyar. paper code
  10. Discriminative nearest neighbor few-shot intent detection by transferring natural language inference, in EMNLP, 2020. J. Zhang, K. Hashimoto, W. Liu, C. Wu, Y. Wan, P. Yu, R. Socher, and C. Xiong. paper code
  11. Few-shot learning for opinion summarization, in EMNLP, 2020. A. Bražinskas, M. Lapata, and I. Titov. paper code
  12. Adaptive attentional network for few-shot knowledge graph completion, in EMNLP, 2020. J. Sheng, S. Guo, Z. Chen, J. Yue, L. Wang, T. Liu, and H. Xu. paper code
  13. Few-shot complex knowledge base question answering via meta reinforcement learning, in EMNLP, 2020. Y. Hua, Y. Li, G. Haffari, G. Qi, and T. Wu. paper code
  14. Self-supervised meta-learning for few-shot natural language classification tasks, in EMNLP, 2020. T. Bansal, R. Jha, T. Munkhdalai, and A. McCallum. paper code
  15. Uncertainty-aware self-training for few-shot text classification, in NeurIPS, 2020. S. Mukherjee, and A. Awadallah. paper code
  16. Learning to extrapolate knowledge: Transductive few-shot out-of-graph link prediction, in NeurIPS, 2020. J. Baek, D. B. Lee, and S. J. Hwang. paper code
  17. MetaNER: Named entity recognition with meta-learning, in TheWebConf, 2020. J. Li, S. Shang, and L. Shao. paper
  18. Conditionally adaptive multi-task learning: Improving transfer learning in NLP using fewer parameters & less data, in ICLR, 2021. J. Pilault, A. Elhattami, and C. Pal. paper code
  19. Revisiting few-sample BERT fine-tuning, in ICLR, 2021. T. Zhang, F. Wu, A. Katiyar, K. Q. Weinberger, and Y. Artzi. paper code
  20. Few-shot conversational dense retrieval, in SIGIR, 2021. S. Yu, Z. Liu, C. Xiong, T. Feng, and Z. Liu. paper code
  21. Relational learning with gated and attentive neighbor aggregator for few-shot knowledge graph completion, in SIGIR, 2021. G. Niu, Y. Li, C. Tang, R. Geng, J. Dai, Q. Liu, H. Wang, J. Sun, F. Huang, and L. Si. paper
  22. Few-shot language coordination by modeling theory of mind, in ICML, 2021. H. Zhu, G. Neubig, and Y. Bisk. paper code
  23. Graph-evolving meta-learning for low-resource medical dialogue generation, in AAAI, 2021. S. Lin, P. Zhou, X. Liang, J. Tang, R. Zhao, Z. Chen, and L. Lin. paper
  24. KEML: A knowledge-enriched meta-learning framework for lexical relation classification, in AAAI, 2021. C. Wang, M. Qiu, J. Huang, and X. He. paper
  25. Few-shot learning for multi-label intent detection, in AAAI, 2021. Y. Hou, Y. Lai, Y. Wu, W. Che, and T. Liu. paper code
  26. SALNet: Semi-supervised few-shot text classification with attention-based lexicon construction, in AAAI, 2021. J.-H. Lee, S.-K. Ko, and Y.-S. Han. paper
  27. Learning from my friends: Few-shot personalized conversation systems via social networks, in AAAI, 2021. Z. Tian, W. Bi, Z. Zhang, D. Lee, Y. Song, and N. L. Zhang. paper code
  28. Relative and absolute location embedding for few-shot node classification on graph, in AAAI, 2021. Z. Liu, Y. Fang, C. Liu, and S. C. H. Hoi. paper
  29. Few-shot question answering by pretraining span selection, in ACL-IJCNLP, 2021. O. Ram, Y. Kirstain, J. Berant, A. Globerson, and O. Levy. paper code
  30. A closer look at few-shot crosslingual transfer: The choice of shots matters, in ACL-IJCNLP, 2021. M. Zhao, Y. Zhu, E. Shareghi, I. Vulic, R. Reichart, A. Korhonen, and H. Schütze. paper code
  31. Learning from miscellaneous other-classwords for few-shot named entity recognition, in ACL-IJCNLP, 2021. M. Tong, S. Wang, B. Xu, Y. Cao, M. Liu, L. Hou, and J. Li. paper code
  32. Distinct label representations for few-shot text classification, in ACL-IJCNLP, 2021. S. Ohashi, J. Takayama, T. Kajiwara, and Y. Arase. paper code
  33. Entity concept-enhanced few-shot relation extraction, in ACL-IJCNLP, 2021. S. Yang, Y. Zhang, G. Niu, Q. Zhao, and S. Pu. paper code
  34. On training instance selection for few-shot neural text generation, in ACL-IJCNLP, 2021. E. Chang, X. Shen, H.-S. Yeh, and V. Demberg. paper code
  35. Unsupervised neural machine translation for low-resource domains via meta-learning, in ACL-IJCNLP, 2021. C. Park, Y. Tae, T. Kim, S. Yang, M. A. Khan, L. Park, and J. Choo. paper code
  36. Meta-learning with variational semantic memory for word sense disambiguation, in ACL-IJCNLP, 2021. Y. Du, N. Holla, X. Zhen, C. Snoek, and E. Shutova. paper code
  37. Multi-label few-shot learning for aspect category detection, in ACL-IJCNLP, 2021. M. Hu, S. Zhao, H. Guo, C. Xue, H. Gao, T. Gao, R. Cheng, and Z. Su. paper
  38. TextSETTR: Few-shot text style extraction and tunable targeted restyling, in ACL-IJCNLP, 2021. P. Riley, N. Constant, M. Guo, G. Kumar, D. Uthus, and Z. Parekh. paper
  39. Few-shot text ranking with meta adapted synthetic weak supervision, in ACL-IJCNLP, 2021. S. Sun, Y. Qian, Z. Liu, C. Xiong, K. Zhang, J. Bao, Z. Liu, and P. Bennett. paper code
  40. PROTAUGMENT: Intent detection meta-learning through unsupervised diverse paraphrasing, in ACL-IJCNLP, 2021. T. Dopierre, C. Gravier, and W. Logerais. paper code
  41. AUGNLG: Few-shot natural language generation using self-trained data augmentation, in ACL-IJCNLP, 2021. X. Xu, G. Wang, Y.-B. Kim, and S. Lee. paper code
  42. Meta self-training for few-shot neural sequence labeling, in KDD, 2021. Y. Wang, S. Mukherjee, H. Chu, Y. Tu, M. Wu, J. Gao, and A. H. Awadallah. paper code
  43. Knowledge-enhanced domain adaptation in few-shot relation classification, in KDD, 2021. J. Zhang, J. Zhu, Y. Yang, W. Shi, C. Zhang, and H. Wang. paper code
  44. Few-shot text classification with triplet networks, data augmentation, and curriculum learning, in NAACL-HLT, 2021. J. Wei, C. Huang, S. Vosoughi, Y. Cheng, and S. Xu. paper code
  45. Few-shot intent classification and slot filling with retrieved examples, in NAACL-HLT, 2021. D. Yu, L. He, Y. Zhang, X. Du, P. Pasupat, and Q. Li. paper
  46. Non-parametric few-shot learning for word sense disambiguation, in NAACL-HLT, 2021. H. Chen, M. Xia, and D. Chen. paper code
  47. Towards few-shot fact-checking via perplexity, in NAACL-HLT, 2021. N. Lee, Y. Bang, A. Madotto, and P. Fung. paper
  48. ConVEx: Data-efficient and few-shot slot labeling, in NAACL-HLT, 2021. M. Henderson, and I. Vulic. paper
  49. Few-shot text generation with natural language instructions, in EMNLP, 2021. T. Schick, and H. Schütze. paper
  50. Towards realistic few-shot relation extraction, in EMNLP, 2021. S. Brody, S. Wu, and A. Benton. paper code
  51. Few-shot emotion recognition in conversation with sequential prototypical networks, in EMNLP, 2021. G. Guibon, M. Labeau, H. Flamein, L. Lefeuvre, and C. Clavel. paper code
  52. Learning prototype representations across few-shot tasks for event detection, in EMNLP, 2021. V. Lai, F. Dernoncourt, and T. H. Nguyen. paper
  53. Exploring task difficulty for few-shot relation extraction, in EMNLP, 2021. J. Han, B. Cheng, and W. Lu. paper code
  54. Honey or poison? Solving the trigger curse in few-shot event detection via causal intervention, in EMNLP, 2021. J. Chen, H. Lin, X. Han, and L. Sun. paper code
  55. Nearest neighbour few-shot learning for cross-lingual classification, in EMNLP, 2021. M. S. Bari, B. Haider, and S. Mansour. paper
  56. Knowledge-aware meta-learning for low-resource text classification, in EMNLP, 2021. H. Yao, Y. Wu, M. Al-Shedivat, and E. P. Xing. paper code
  57. Few-shot named entity recognition: An empirical baseline study, in EMNLP, 2021. J. Huang, C. Li, K. Subudhi, D. Jose, S. Balakrishnan, W. Chen, B. Peng, J. Gao, and J. Han. paper
  58. MetaTS: Meta teacher-student network for multilingual sequence labeling with minimal supervision, in EMNLP, 2021. Z. Li, D. Zhang, T. Cao, Y. Wei, Y. Song, and B. Yin. paper
  59. Meta-LMTC: Meta-learning for large-scale multi-label text classification, in EMNLP, 2021. R. Wang, X. Su, S. Long, X. Dai, S. Huang, and J. Chen. paper
  60. Ontology-enhanced prompt-tuning for few-shot learning, in TheWebConf, 2022. H. Ye, N. Zhang, S. Deng, X. Chen, H. Chen, F. Xiong, X. Chen, and H. Chen. paper
  61. EICO: Improving few-shot text classification via explicit and implicit consistency regularization, in Findings of ACL, 2022. L. Zhao, and C. Yao. paper
  62. Dialogue summaries as dialogue states (DS2), template-guided summarization for few-shot dialogue state tracking, in Findings of ACL, 2022. J. Shin, H. Yu, H. Moon, A. Madotto, and J. Park. paper code
  63. A few-shot semantic parser for Wizard-of-Oz dialogues with the precise ThingTalk representation, in Findings of ACL, 2022. G. Campagna, S. J. Semnani, R. Kearns, L. J. K. Sato, S. Xu, and M. S. Lam. paper
  64. Multi-stage prompting for knowledgeable dialogue generation, in Findings of ACL, 2022. Z. Liu, M. Patwary, R. Prenger, S. Prabhumoye, W. Ping, M. Shoeybi, and B. Catanzaro. paper code
  65. Few-shot named entity recognition with self-describing networks, in ACL, 2022. J. Chen, Q. Liu, H. Lin, X. Han, and L. Sun. paper code
  66. CLIP models are few-shot learners: Empirical studies on VQA and visual entailment, in ACL, 2022. H. Song, L. Dong, W. Zhang, T. Liu, and F. Wei. paper
  67. CONTaiNER: Few-shot named entity recognition via contrastive learning, in ACL, 2022. S. S. S. Das, A. Katiyar, R. J. Passonneau, and R. Zhang. paper code
  68. Few-shot controllable style transfer for low-resource multilingual settings, in ACL, 2022. K. Krishna, D. Nathani, X. Garcia, B. Samanta, and P. Talukdar. paper
  69. Label semantic aware pre-training for few-shot text classification, in ACL, 2022. A. Mueller, J. Krone, S. Romeo, S. Mansour, E. Mansimov, Y. Zhang, and D. Roth. paper
  70. Inverse is better! Fast and accurate prompt for few-shot slot tagging, in Findings of ACL, 2022. Y. Hou, C. Chen, X. Luo, B. Li, and W. Che. paper
  71. Label semantics for few shot named entity recognition, in Findings of ACL, 2022. J. Ma, M. Ballesteros, S. Doss, R. Anubhai, S. Mallya, Y. Al-Onaizan, and D. Roth. paper
  72. Hierarchical recurrent aggregative generation for few-shot NLG, in Findings of ACL, 2022. G. Zhou, G. Lampouras, and I. Iacobacci. paper
  73. Towards few-shot entity recognition in document images: A label-aware sequence-to-sequence framework, in Findings of ACL, 2022. Z. Wang, and J. Shang. paper
  74. A good prompt is worth millions of parameters: Low-resource prompt-based learning for vision-language models, in ACL, 2022. W. Jin, Y. Cheng, Y. Shen, W. Chen, and X. Ren. paper code
  75. Generated knowledge prompting for commonsense reasoning, in ACL, 2022. J. Liu, A. Liu, X. Lu, S. Welleck, P. West, R. L. Bras, Y. Choi, and H. Hajishirzi. paper code
  76. End-to-end modeling via information tree for one-shot natural language spatial video grounding, in ACL, 2022. M. Li, T. Wang, H. Zhang, S. Zhang, Z. Zhao, J. Miao, W. Zhang, W. Tan, J. Wang, P. Wang, S. Pu, and F. Wu. paper
  77. Leveraging task transferability to meta-learning for clinical section classification with limited data, in ACL, 2022. Z. Chen, J. Kim, R. Bhakta, and M. Y. Sir. paper
  78. Improving meta-learning for low-resource text classification and generation via memory imitation, in ACL, 2022. Y. Zhao, Z. Tian, H. Yao, Y. Zheng, D. Lee, Y. Song, J. Sun, and N. L. Zhang. paper
  79. A simple yet effective relation information guided approach for few-shot relation extraction, in Findings of ACL, 2022. Y. Liu, J. Hu, X. Wan, and T. Chang. paper code
  80. Decomposed meta-learning for few-shot named entity recognition, in Findings of ACL, 2022. T. Ma, H. Jiang, Q. Wu, T. Zhao, and C. Lin. paper code
  81. Meta-learning for fast cross-lingual adaptation in dependency parsing, in ACL, 2022. A. Langedijk, V. Dankers, P. Lippe, S. Bos, B. C. Guevara, H. Yannakoudakis, and E. Shutova. paper code
  82. Enhancing cross-lingual natural language inference by prompt-learning from cross-lingual templates, in ACL, 2022. K. Qi, H. Wan, J. Du, and H. Chen. paper code

Acoustic Signal Processing

  1. One-shot learning of generative speech concepts, in CogSci, 2014. B. Lake, C.-Y. Lee, J. Glass, and J. Tenenbaum. paper
  2. Machine speech chain with one-shot speaker adaptation, in INTERSPEECH, 2018. A. Tjandra, S. Sakti, and S. Nakamura. paper
  3. Investigation of using disentangled and interpretable representations for one-shot cross-lingual voice conversion, in INTERSPEECH, 2018. S. H. Mohammadi and T. Kim. paper
  4. Few-shot audio classification with attentional graph neural networks, in INTERSPEECH, 2019. S. Zhang, Y. Qin, K. Sun, and Y. Lin. paper
  5. One-shot voice conversion with disentangled representations by leveraging phonetic posteriorgrams, in INTERSPEECH, 2019. S. H. Mohammadi, and T. Kim. paper
  6. One-shot voice conversion with global speaker embeddings, in INTERSPEECH, 2019. H. Lu, Z. Wu, D. Dai, R. Li, S. Kang, J. Jia, and H. Meng. paper
  7. One-shot voice conversion by separating speaker and content representations with instance normalization, in INTERSPEECH, 2019. J.-C. Chou, and H.-Y. Lee. paper
  8. Audio2Head: Audio-driven one-shot talking-head generation with natural head motion, in IJCAI, 2021. S. Wang, L. Li, Y. Ding, C. Fan, and X. Yu. paper

Recommendation

  1. A meta-learning perspective on cold-start recommendations for items, in NeurIPS, 2017. M. Vartak, A. Thiagarajan, C. Miranda, J. Bratman, and H. Larochelle. paper
  2. MeLU: Meta-learned user preference estimator for cold-start recommendation, in KDD, 2019. H. Lee, J. Im, S. Jang, H. Cho, and S. Chung. paper code
  3. Sequential scenario-specific meta learner for online recommendation, in KDD, 2019. Z. Du, X. Wang, H. Yang, J. Zhou, and J. Tang. paper code
  4. Few-shot learning for new user recommendation in location-based social networks, in TheWebConf, 2020. R. Li, X. Wu, X. Chen, and W. Wang. paper
  5. MAMO: Memory-augmented meta-optimization for cold-start recommendation, in KDD, 2020. M. Dong, F. Yuan, L. Yao, X. Xu, and L. Zhu. paper code
  6. Meta-learning on heterogeneous information networks for cold-start recommendation, in KDD, 2020. Y. Lu, Y. Fang, and C. Shi. paper code
  7. MetaSelector: Meta-learning for recommendation with user-level adaptive model selection, in TheWebConf, 2020. M. Luo, F. Chen, P. Cheng, Z. Dong, X. He, J. Feng, and Z. Li. paper
  8. Fast adaptation for cold-start collaborative filtering with meta-learning, in ICDM, 2020. T. Wei, Z. Wu, R. Li, Z. Hu, F. Feng, X. H. Sun, and W. Wang. paper
  9. Preference-adaptive meta-learning for cold-start recommendation, in IJCAI, 2021. L. Wang, B. Jin, Z. Huang, H. Zhao, D. Lian, Q. Liu, and E. Chen. paper
  10. Meta-learning helps personalized product search, in TheWebConf, 2022. B. Wu, Z. Meng, Q. Zhang, and S. Liang. paper
  11. Alleviating cold-start problem in CTR prediction with a variational embedding learning framework, in TheWebConf, 2022. X. Xu, C. Yang, Q. Yu, Z. Fang, J. Wang, C. Fan, Y. He, C. Peng, Z. Lin, and J. Shao. paper
  12. PNMTA: A pretrained network modulation and task adaptation approach for user cold-start recommendation, in TheWebConf, 2022. H. Pang, F. Giunchiglia, X. Li, R. Guan, and X. Feng. paper

Others

  1. Low data drug discovery with one-shot learning, ACS Central Science, 2017. H. Altae-Tran, B. Ramsundar, A. S. Pappu, and V. Pande. paper
  2. SMASH: One-shot model architecture search through hypernetworks, in ICLR, 2018. A. Brock, T. Lim, J. Ritchie, and N. Weston. paper
  3. SPARC: Self-paced network representation for few-shot rare category characterization, in KDD, 2018. D. Zhou, J. He, H. Yang, and W. Fan. paper
  4. MetaPred: Meta-learning for clinical risk prediction with limited patient electronic health records, in KDD, 2019. X. S. Zhang, F. Tang, H. H. Dodge, J. Zhou, and F. Wang. paper code
  5. AffinityNet: Semi-supervised few-shot learning for disease type prediction, in AAAI, 2019. T. Ma, and A. Zhang. paper
  6. Learning from multiple cities: A meta-learning approach for spatial-temporal prediction, in TheWebConf, 2019. H. Yao, Y. Liu, Y. Wei, X. Tang, and Z. Li. paper code
  7. Federated meta-learning for fraudulent credit card detection, in IJCAI, 2020. W. Zheng, L. Yan, C. Gou, and F. Wang. paper
  8. Differentially private meta-learning, in ICLR, 2020. J. Li, M. Khodak, S. Caldas, and A. Talwalkar. paper
  9. Towards fast adaptation of neural architectures with meta learning, in ICLR, 2020. D. Lian, Y. Zheng, Y. Xu, Y. Lu, L. Lin, P. Zhao, J. Huang, and S. Gao. paper
  10. Using optimal embeddings to learn new intents with few examples: An application in the insurance domain, in KDD, 2020. S. Acharya, and G. Fung. paper
  11. Meta-learning for query conceptualization at web scale, in KDD, 2020. F. X. Han, D. Niu, H. Chen, W. Guo, S. Yan, and B. Long. paper
  12. Few-sample and adversarial representation learning for continual stream mining, in TheWebConf, 2020. Z. Wang, Y. Wang, Y. Lin, E. Delord, and L. Khan. paper
  13. Few-shot graph learning for molecular property prediction, in TheWebConf, 2021. Z. Guo, C. Zhang, W. Yu, J. Herr, O. Wiest, M. Jiang, and N. V. Chawla. paper code
  14. Taxonomy-aware learning for few-shot event detection, in TheWebConf, 2021. J. Zheng, F. Cai, W. Chen, W. Lei, and H. Chen. paper
  15. Learning from graph propagation via ordinal distillation for one-shot automated essay scoring, in TheWebConf, 2021. Z. Jiang, M. Liu, Y. Yin, H. Yu, Z. Cheng, and Q. Gu. paper
  16. Few-shot network anomaly detection via cross-network meta-learning, in TheWebConf, 2021. K. Ding, Q. Zhou, H. Tong, and H. Liu. paper
  17. Few-shot knowledge validation using rules, in TheWebConf, 2021. M. Loster, D. Mottin, P. Papotti, J. Ehmüller, B. Feldmann, and F. Naumann. paper
  18. Graph learning regularization and transfer learning for few-shot event detection, in SIGIR, 2021. V. D. Lai, M. V. Nguyen, T. H. Nguyen, and F. Dernoncourt. paper code
  19. Progressive network grafting for few-shot knowledge distillation, in AAAI, 2021. C. Shen, X. Wang, Y. Yin, J. Song, S. Luo, and M. Song. paper code
  20. Curriculum meta-learning for next POI recommendation, in KDD, 2021. Y. Chen, X. Wang, M. Fan, J. Huang, S. Yang, and W. Zhu. paper code
  21. MFNP: A meta-optimized model for few-shot next POI recommendation, in IJCAI, 2021. H. Sun, J. Xu, K. Zheng, P. Zhao, P. Chao, and X. Zhou. paper
  22. Physics-aware spatiotemporal modules with auxiliary tasks for meta-learning, in IJCAI, 2021. S. Seo, C. Meng, S. Rambhatla, and Y. Liu. paper
  23. Property-aware relation networks for few-shot molecular property prediction, in NeurIPS, 2021. Y. Wang, A. Abuduweili, Q. Yao, and D. Dou. paper code
  24. Few-shot data-driven algorithms for low rank approximation, in NeurIPS, 2021. P. Indyk, T. Wagner, and D. Woodruff. paper
  25. Non-Gaussian Gaussian processes for few-shot regression, in NeurIPS, 2021. M. Sendera, J. Tabor, A. Nowak, A. Bedychaj, M. Patacchiola, T. Trzcinski, P. Spurek, and M. Zieba. paper
  26. HELP: Hardware-adaptive efficient latency prediction for NAS via meta-learning, in NeurIPS, 2021. H. Lee, S. Lee, S. Chong, and S. J. Hwang. paper
  27. Learning to learn dense Gaussian processes for few-shot learning, in NeurIPS, 2021. Z. Wang, Z. Miao, X. Zhen, and Q. Qiu. paper
  28. A meta-learning based stress category detection framework on social media, in TheWebConf, 2022. X. Wang, L. Cao, H. Zhang, L. Feng, Y. Ding, and N. Li. paper

Theories

  1. Learning to learn around a common mean, in NeurIPS, 2018. G. Denevi, C. Ciliberto, D. Stamos, and M. Pontil. paper
  2. Meta-learning and universality: Deep representations and gradient descent can approximate any learning algorithm, in ICLR, 2018. C. Finn and S. Levine. paper
  3. A theoretical analysis of the number of shots in few-shot learning, in ICLR, 2020. T. Cao, M. T. Law, and S. Fidler. paper
  4. Rapid learning or feature reuse? Towards understanding the effectiveness of MAML, in ICLR, 2020. A. Raghu, M. Raghu, S. Bengio, and O. Vinyals. paper
  5. Robust meta-learning for mixed linear regression with small batches, in NeurIPS, 2020. W. Kong, R. Somani, S. Kakade, and S. Oh. paper
  6. One-shot distributed ridge regression in high dimensions, in ICML, 2020. Y. Sheng, and E. Dobriban. paper
  7. Bridging the gap between practice and PAC-Bayes theory in few-shot meta-learning, in NeurIPS, 2021. N. Ding, X. Chen, T. Levinboim, S. Goodman, and R. Soricut. paper
  8. Generalization bounds for meta-learning: An information-theoretic analysis, in NeurIPS, 2021. Q. Chen, C. Shui, and M. Marchand. paper
  9. Generalization bounds for meta-learning via PAC-Bayes and uniform stability, in NeurIPS, 2021. A. Farid, and A. Majumdar. paper
  10. Unraveling model-agnostic meta-learning via the adaptation learning rate, in ICLR, 2022. Y. Zou, F. Liu, and Q. Li. paper
  11. On the importance of Firth bias reduction in few-shot classification, in ICLR, 2022. S. Ghaffari, E. Saleh, D. Forsyth, and Y. Wang. paper code
  12. Global convergence of MAML and theory-inspired neural architecture search for few-shot learning, in CVPR, 2022. H. Wang, Y. Wang, R. Sun, and B. Li. paper

Few-shot Learning and Zero-shot Learning

  1. Label-embedding for attribute-based classification, in CVPR, 2013. Z. Akata, F. Perronnin, Z. Harchaoui, and C. Schmid. paper
  2. A unified semantic embedding: Relating taxonomies and attributes, in NeurIPS, 2014. S. J. Hwang and L. Sigal. paper
  3. Multi-attention network for one shot learning, in CVPR, 2017. P. Wang, L. Liu, C. Shen, Z. Huang, A. van den Hengel, and H. T. Shen. paper
  4. Few-shot and zero-shot multi-label learning for structured label spaces, in EMNLP, 2018. A. Rios and R. Kavuluru. paper
  5. Learning compositional representations for few-shot recognition, in ICCV, 2019. P. Tokmakov, Y.-X. Wang, and M. Hebert. paper code
  6. Large-scale few-shot learning: Knowledge transfer with class hierarchy, in CVPR, 2019. A. Li, T. Luo, Z. Lu, T. Xiang, and L. Wang. paper
  7. Generalized zero- and few-shot learning via aligned variational autoencoders, in CVPR, 2019. E. Schonfeld, S. Ebrahimi, S. Sinha, T. Darrell, and Z. Akata. paper code
  8. F-VAEGAN-D2: A feature generating framework for any-shot learning, in CVPR, 2019. Y. Xian, S. Sharma, B. Schiele, and Z. Akata. paper
  9. TGG: Transferable graph generation for zero-shot and few-shot learning, in ACM MM, 2019. C. Zhang, X. Lyu, and Z. Tang. paper
  10. Adaptive cross-modal few-shot learning, in NeurIPS, 2019. C. Xing, N. Rostamzadeh, B. N. Oreshkin, and P. O. Pinheiro. paper
  11. Learning meta model for zero- and few-shot face anti-spoofing, in AAAI, 2020. Y. Qin, C. Zhao, X. Zhu, Z. Wang, Z. Yu, T. Fu, F. Zhou, J. Shi, and Z. Lei. paper
  12. RD-GAN: Few/Zero-shot Chinese character style transfer via radical decomposition and rendering, in ECCV, 2020. Y. Huang, M. He, L. Jin, and Y. Wang. paper
  13. An empirical study on large-scale multi-label text classification including few and zero-shot labels, in EMNLP, 2020. I. Chalkidis, M. Fergadiotis, S. Kotitsas, P. Malakasiotis, N. Aletras, and I. Androutsopoulos. paper
  14. Multi-label few/zero-shot learning with knowledge aggregated from multiple label graphs, in EMNLP, 2020. J. Lu, L. Du, M. Liu, and J. Dipnall. paper
  15. Emergent complexity and zero-shot transfer via unsupervised environment design, in NeurIPS, 2020. M. Dennis, N. Jaques, E. Vinitsky, A. Bayen, S. Russell, A. Critch, and S. Levine. paper
  16. Learning graphs for knowledge transfer with limited labels, in CVPR, 2021. P. Ghosh, N. Saini, L. S. Davis, and A. Shrivastava. paper
  17. Improving zero and few-shot abstractive summarization with intermediate fine-tuning and data augmentation, in NAACL-HLT, 2021. A. R. Fabbri, S. Han, H. Li, H. Li, M. Ghazvininejad, S. R. Joty, D. R. Radev, and Y. Mehdad. paper
  18. Label verbalization and entailment for effective zero and few-shot relation extraction, in EMNLP, 2021. O. Sainz, O. L. d. Lacalle, G. Labaka, A. Barrena, and E. Agirre. paper code
  19. An empirical investigation of word alignment supervision for zero-shot multilingual neural machine translation, in EMNLP, 2021. A. Raganato, R. Vázquez, M. Creutz, and J. Tiedemann. paper
  20. Bridge to target domain by prototypical contrastive learning and label confusion: Re-explore zero-shot learning for slot filling, in EMNLP, 2021. L. Wang, X. Li, J. Liu, K. He, Y. Yan, and W. Xu. paper code
  21. A label-aware BERT attention network for zero-shot multi-intent detection in spoken language understanding, in EMNLP, 2021. T. Wu, R. Su, and B. Juang. paper
  22. Zero-shot dialogue disentanglement by self-supervised entangled response selection, in EMNLP, 2021. T. Chi, and A. I. Rudnicky. paper code
  23. Robust retrieval augmented generation for zero-shot slot filling, in EMNLP, 2021. M. R. Glass, G. Rossiello, M. F. M. Chowdhury, and A. Gliozzo. paper code
  24. Everything is all it takes: A multipronged strategy for zero-shot cross-lingual information extraction, in EMNLP, 2021. M. Yarmohammadi, S. Wu, M. Marone, H. Xu, S. Ebner, G. Qin, Y. Chen, J. Guo, C. Harman, K. Murray, A. S. White, M. Dredze, and B. V. Durme. paper code
  25. An empirical study on multiple information sources for zero-shot fine-grained entity typing, in EMNLP, 2021. Y. Chen, H. Jiang, L. Liu, S. Shi, C. Fan, M. Yang, and R. Xu. paper
  26. Zero-shot dialogue state tracking via cross-task transfer, in EMNLP, 2021. Z. Lin, B. Liu, A. Madotto, S. Moon, Z. Zhou, P. Crook, Z. Wang, Z. Yu, E. Cho, R. Subba, and P. Fung. paper code
  27. Finetuned language models are zero-shot learners, in ICLR, 2022. J. Wei, M. Bosma, V. Zhao, K. Guu, A. W. Yu, B. Lester, N. Du, A. M. Dai, and Q. V. Le. paper code
  28. Zero-shot stance detection via contrastive learning, in TheWebConf, 2022. B. Liang, Z. Chen, L. Gui, Y. He, M. Yang, and R. Xu. paper code
  29. Reframing instructional prompts to GPTk’s language, in Findings of ACL, 2022. D. Khashabi, C. Baral, Y. Choi, and H. Hajishirzi. paper
  30. JointCL: A joint contrastive learning framework for zero-shot stance detection, in ACL, 2022. B. Liang, Q. Zhu, X. Li, M. Yang, L. Gui, Y. He, and R. Xu. paper code
  31. Knowledgeable prompt-tuning: Incorporating knowledge into prompt verbalizer for text classification, in ACL, 2022. S. Hu, N. Ding, H. Wang, Z. Liu, J. Wang, J. Li, W. Wu, and M. Sun. paper code
  32. Uni-Perceiver: Pre-training unified architecture for generic perception for zero-shot and few-shot tasks, in CVPR, 2022. X. Zhu, J. Zhu, H. Li, X. Wu, H. Li, X. Wang, and J. Dai. paper

Variants of Few-shot Learning

  1. Continuous adaptation via meta-learning in nonstationary and competitive environments, in ICLR, 2018. M. Al-Shedivat, T. Bansal, Y. Burda, I. Sutskever, I. Mordatch, and P. Abbeel. paper
  2. Deep online learning via meta-learning: Continual adaptation for model-based RL, in ICLR, 2018. A. Nagabandi, C. Finn, and S. Levine. paper
  3. Incremental few-shot learning with attention attractor networks, in NeurIPS, 2019. M. Ren, R. Liao, E. Fetaya, and R. S. Zemel. paper code
  4. Bidirectional one-shot unsupervised domain mapping, in ICCV, 2019. T. Cohen, and L. Wolf. paper
  5. XtarNet: Learning to extract task-adaptive representation for incremental few-shot learning, in ICML, 2020. S. W. Yoon, D. Kim, J. Seo, and J. Moon. paper code
  6. Few-shot class-incremental learning, in CVPR, 2020. X. Tao, X. Hong, X. Chang, S. Dong, X. Wei, and Y. Gong. paper
  7. Wandering within a world: Online contextualized few-shot learning, in ICLR, 2021. M. Ren, M. L. Iuzzolino, M. C. Mozer, and R. Zemel. paper
  8. Repurposing pretrained models for robust out-of-domain few-shot learning, in ICLR, 2021. N. Kwon, H. Na, G. Huang, and S. Lacoste-Julien. paper code
  9. Prototypical cross-domain self-supervised learning for few-shot unsupervised domain adaptation, in CVPR, 2021. X. Yue, Z. Zheng, S. Zhang, Y. Gao, T. Darrell, K. Keutzer, and A. S. Vincentelli. paper
  10. Self-promoted prototype refinement for few-shot class-incremental learning, in CVPR, 2021. K. Zhu, Y. Cao, W. Zhai, J. Cheng, and Z. Zha. paper
  11. Semantic-aware knowledge distillation for few-shot class-incremental learning, in CVPR, 2021. A. Cheraghian, S. Rahman, P. Fang, S. K. Roy, L. Petersson, and M. Harandi. paper
  12. Few-shot incremental learning with continually evolved classifiers, in CVPR, 2021. C. Zhang, N. Song, G. Lin, Y. Zheng, P. Pan, and Y. Xu. paper
  13. Learning a universal template for few-shot dataset generalization, in ICML, 2021. E. Triantafillou, H. Larochelle, R. Zemel, and V. Dumoulin. paper
  14. GP-Tree: A Gaussian process classifier for few-shot incremental learning, in ICML, 2021. I. Achituve, A. Navon, Y. Yemini, G. Chechik, and E. Fetaya. paper code
  15. Addressing catastrophic forgetting in few-shot problems, in ICML, 2021. P. Yap, H. Ritter, and D. Barber. paper code
  16. Few-shot conformal prediction with auxiliary tasks, in ICML, 2021. A. Fisch, T. Schuster, T. Jaakkola, and R. Barzilay. paper code
  17. Few-shot lifelong learning, in AAAI, 2021. P. Mazumder, P. Singh, and P. Rai. paper
  18. Few-shot class-incremental learning via relation knowledge distillation, in AAAI, 2021. S. Dong, X. Hong, X. Tao, X. Chang, X. Wei, and Y. Gong. paper
  19. Few-shot one-class classification via meta-learning, in AAAI, 2021. A. Frikha, D. Krompass, H. Koepken, and V. Tresp. paper code
  20. Practical one-shot federated learning for cross-silo setting, in IJCAI, 2021. Q. Li, B. He, and D. Song. paper code
  21. Incremental few-shot text classification with multi-round new classes: Formulation, dataset and system, in NAACL-HLT, 2021. C. Xia, W. Yin, Y. Feng, and P. S. Yu. paper
  22. Continual few-shot learning for text classification, in EMNLP, 2021. R. Pasunuru, V. Stoyanov, and M. Bansal. paper code
  23. Self-training with few-shot rationalization, in EMNLP, 2021. M. M. Bhat, A. Sordoni, and S. Mukherjee. paper
  24. Diverse distributions of self-supervised tasks for meta-learning in NLP, in EMNLP, 2021. T. Bansal, K. P. Gunasekaran, T. Wang, T. Munkhdalai, and A. McCallum. paper
  25. Generalized and incremental few-shot learning by explicit learning and calibration without forgetting, in ICCV, 2021. A. Kukleva, H. Kuehne, and B. Schiele. paper
  26. Meta learning on a sequence of imbalanced domains with difficulty awareness, in ICCV, 2021. Z. Wang, T. Duan, L. Fang, Q. Suo, and M. Gao. paper code
  27. Synthesized feature based few-shot class-incremental learning on a mixture of subspaces, in ICCV, 2021. A. Cheraghian, S. Rahman, S. Ramasinghe, P. Fang, C. Simon, L. Petersson, and M. Harandi. paper
  28. Few-shot and continual learning with attentive independent mechanisms, in ICCV, 2021. E. Lee, C. Huang, and C. Lee. paper code
  29. Low-shot validation: Active importance sampling for estimating classifier performance on rare categories, in ICCV, 2021. F. Poms, V. Sarukkai, R. T. Mullapudi, N. S. Sohoni, W. R. Mark, D. Ramanan, and K. Fatahalian. paper
  30. Overcoming catastrophic forgetting in incremental few-shot learning by finding flat minima, in NeurIPS, 2021. G. Shi, J. Chen, W. Zhang, L. Zhan, and X. Wu. paper
  31. Variational continual Bayesian meta-learning, in NeurIPS, 2021. Q. Zhang, J. Fang, Z. Meng, S. Liang, and E. Yilmaz. paper
  32. LFPT5: A unified framework for lifelong few-shot language learning based on prompt tuning of T5, in ICLR, 2022. C. Qin, and S. Joty. paper code
  33. Subspace regularizers for few-shot class incremental learning, in ICLR, 2022. A. F. Akyürek, E. Akyürek, D. Wijaya, and J. Andreas. paper code
  34. Meta discovery: Learning to discover novel classes given very limited data, in ICLR, 2022. H. Chi, F. Liu, W. Yang, L. Lan, T. Liu, B. Han, G. Niu, M. Zhou, and M. Sugiyama. paper
  35. Topological transduction for hybrid few-shot learning, in TheWebConf, 2022. J. Chen, and A. Zhang. paper
  36. Continual few-shot relation learning via embedding space regularization and data augmentation, in ACL, 2022. C. Qin, and S. Joty. paper code
  37. Few-shot class-incremental learning for named entity recognition, in ACL, 2022. R. Wang, T. Yu, H. Zhao, S. Kim, S. Mitra, R. Zhang, and R. Henao. paper
  38. Task-adaptive negative envision for few-shot open-set recognition, in CVPR, 2022. S. Huang, J. Ma, G. Han, and S. Chang. paper code
  39. Forward compatible few-shot class-incremental learning, in CVPR, 2022. D. Zhou, F. Wang, H. Ye, L. Ma, S. Pu, and D. Zhan. paper code
  40. Sylph: A hypernetwork framework for incremental few-shot object detection, in CVPR, 2022. L. Yin, J. M. Perez-Rua, and K. J. Liang. paper
  41. Constrained few-shot class-incremental learning, in CVPR, 2022. M. Hersche, G. Karunaratne, G. Cherubini, L. Benini, A. Sebastian, and A. Rahimi. paper
  42. iFS-RCNN: An incremental few-shot instance segmenter, in CVPR, 2022. K. Nguyen, and S. Todorovic. paper
  43. MetaFSCIL: A meta-learning approach for few-shot class incremental learning, in CVPR, 2022. Z. Chi, L. Gu, H. Liu, Y. Wang, Y. Yu, and J. Tang. paper
  44. Few-shot incremental learning for label-to-image translation, in CVPR, 2022. P. Chen, Y. Zhang, Z. Li, and L. Sun. paper
  45. Revisiting learnable affines for batch norm in few-shot transfer learning, in CVPR, 2022. M. Yazdanpanah, A. A. Rahman, M. Chaudhary, C. Desrosiers, M. Havaei, E. Belilovsky, and S. E. Kahou. paper
  46. Few-shot learning with noisy labels, in CVPR, 2022. K. J. Liang, S. B. Rangrej, V. Petrovic, and T. Hassner. paper
  47. Improving adversarially robust few-shot image classification with generalizable representations, in CVPR, 2022. J. Dong, Y. Wang, J. Lai, and X. Xie. paper

Datasets/Benchmarks

  1. FewRel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation, in EMNLP, 2018. X. Han, H. Zhu, P. Yu, Z. Wang, Y. Yao, Z. Liu, and M. Sun. paper code
  2. Meta-World: A benchmark and evaluation for multi-task and meta reinforcement learning, arXiv preprint, 2019. T. Yu, D. Quillen, Z. He, R. Julian, K. Hausman, C. Finn, and S. Levine. paper code
  3. The Omniglot challenge: A 3-year progress report, in Current Opinion in Behavioral Sciences, 2019. B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum. paper code
  4. FewRel 2.0: Towards more challenging few-shot relation classification, in EMNLP-IJCNLP, 2019. T. Gao, X. Han, H. Zhu, Z. Liu, P. Li, M. Sun, and J. Zhou. paper code
  5. META-DATASET: A dataset of datasets for learning to learn from few examples, in ICLR, 2020. E. Triantafillou, T. Zhu, V. Dumoulin, P. Lamblin, U. Evci, K. Xu, R. Goroshin, C. Gelada, K. Swersky, P. Manzagol, and H. Larochelle. paper code
  6. Few-shot object detection with attention-rpn and multi-relation detector, in CVPR, 2020. Q. Fan, W. Zhuo, C.-K. Tang, and Y.-W. Tai. paper code
  7. FSS-1000: A 1000-class dataset for few-shot segmentation, in CVPR, 2020. X. Li, T. Wei, Y. P. Chen, Y.-W. Tai, and C.-K. Tang. paper code
  8. Impact of base dataset design on few-shot image classification, in ECCV, 2020. O. Sbai, C. Couprie, and M. Aubry. paper code
  9. A large-scale benchmark for few-shot program induction and synthesis, in ICML, 2021. F. Alet, J. Lopez-Contreras, J. Koppel, M. Nye, A. Solar-Lezama, T. Lozano-Perez, L. Kaelbling, and J. Tenenbaum. paper code
  10. FEW-NERD: A few-shot named entity recognition dataset, in ACL-IJCNLP, 2021. N. Ding, G. Xu, Y. Chen, X. Wang, X. Han, P. Xie, H. Zheng, and Z. Liu. paper code
  11. CrossFit: A few-shot learning challenge for cross-task generalization in NLP, in EMNLP, 2021. Q. Ye, B. Y. Lin, and X. Ren. paper code
  12. ORBIT: A real-world few-shot dataset for teachable object recognition, in ICCV, 2021. D. Massiceti, L. Zintgraf, J. Bronskill, L. Theodorou, M. T. Harris, E. Cutrell, C. Morrison, K. Hofmann, and S. Stumpf. paper code
  13. FLEX: Unifying evaluation for few-shot NLP, in NeurIPS, 2021. J. Bragg, A. Cohan, K. Lo, and I. Beltagy. paper
  14. Two sides of meta-learning evaluation: In vs. out of distribution, in NeurIPS, 2021. A. Setlur, O. Li, and V. Smith. paper
  15. Realistic evaluation of transductive few-shot learning, in NeurIPS, 2021. O. Veilleux, M. Boudiaf, P. Piantanida, and I. B. Ayed. paper
  16. FewNLU: Benchmarking state-of-the-art methods for few-shot natural language understanding, in ACL, 2022. Y. Zheng, J. Zhou, Y. Qian, M. Ding, C. Liao, L. Jian, R. Salakhutdinov, J. Tang, S. Ruder, and Z. Yang. paper code
  17. Bongard-HOI: Benchmarking few-shot visual reasoning for human-object interactions, in CVPR, 2022. H. Jiang, X. Ma, W. Nie, Z. Yu, Y. Zhu, and A. Anandkumar. paper code

Software Library

  1. PaddleFSL, a library for few-shot learning written in PaddlePaddle. link
  2. Torchmeta, a library for few-shot learning & meta-learning written in PyTorch. link
  3. learn2learn, a library for meta-learning written in PyTorch. link
  4. keras-fsl, a library for few-shot learning written in TensorFlow. link

YOLOv7-Pose: A Keypoint Model Based on YOLOv7

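Human pose estimation currently falls into two broad families: top-down and bottom-up. Unlike object detection, keypoint detection algorithms, whether heatmap-based or detector-based, depend heavily on compute resources and have somewhat long inference times. This year, keypoint detectors built on a YOLO baseline appeared. Anyone who has worked on object detection knows that YOLO and its many variants are among the detectors most widely deployed in industry; the simple design and a long-active community keep it a frequent topic of discussion.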

【Evolution】

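At ECCV 2022 and CVPRW 2022, YOLO-Pose and KaPao (hereafter yolo-like-pose) each proposed a novel heatmap-free method built on the popular YOLO object detection framework, similar in spirit to Google's much earlier idea of computing keypoints by regression. yolo-like-pose neither runs a detector as a separate second stage nor stitches heatmaps together. Although it is a brute-force keypoint regression approach, it has a certain advantage in processing speed.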

kapao

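Last November, the University of Waterloo was the first to propose KaPao (Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation), which performs keypoint detection on top of YOLOv5. The paper has since been accepted to ECCV 2022. The performance achieved by the algorithm is as follows:

paper: https://arxiv.org/abs/2111.08557

code: https://github.com/wmcnally/kapao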

yolov5-pose

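This April, yolo-pose was also posted on arXiv. The paper's survey found that heatmap-based methods commonly use an L1 loss. However, an L1 loss is not necessarily suited to obtaining the best OKS. Moreover, because a heatmap is a probability map, OKS cannot be used as the loss in purely heatmap-based methods; only when keypoint locations are regressed directly can OKS serve as the loss function. yolo-pose therefore uses an OKS loss for the keypoints.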
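For reference, the standard COCO definition of object keypoint similarity (OKS), and the loss derived from it, can be written as below. The notation follows the COCO keypoint evaluation and is an assumption here, not something spelled out in the text above: $d_i$ is the Euclidean distance between the predicted and ground-truth location of keypoint $i$, $s$ is the object scale (with $s^2$ the object area), $k_i$ is a per-keypoint constant, and $v_i$ is the visibility flag.

$$
\mathrm{OKS} = \frac{\sum_i \exp\left(-\dfrac{d_i^2}{2 s^2 k_i^2}\right)\,\delta(v_i > 0)}{\sum_i \delta(v_i > 0)},
\qquad
\mathcal{L}_{\mathrm{kpt}} = 1 - \mathrm{OKS}
$$

Because OKS is normalized by the object scale, the same pixel error costs more on a small person than on a large one, which a plain L1 loss on coordinates does not capture.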

The corresponding code can also be found at https://github.com/TexasInstruments/edgeai-yolov5/blob/yolo-pose/utils/loss.py:

    if self.kpt_label:
        #Direct kpt prediction
        pkpt_x = ps[:, 6::3] * 2. - 0.5
        pkpt_y = ps[:, 7::3] * 2. - 0.5
        pkpt_score = ps[:, 8::3]
        #mask
        kpt_mask = (tkpt[i][:, 0::2] != 0)
        lkptv += self.BCEcls(pkpt_score, kpt_mask.float())
        #l2 distance based loss
        #lkpt += (((pkpt-tkpt[i])*kpt_mask)**2).mean()  #Try to make this loss based on distance instead of ordinary difference
        #oks based loss
        d = (pkpt_x-tkpt[i][:,0::2])**2 + (pkpt_y-tkpt[i][:,1::2])**2
        s = torch.prod(tbox[i][:,-2:], dim=1, keepdim=True)
        kpt_loss_factor = (torch.sum(kpt_mask != 0) + torch.sum(kpt_mask == 0))/torch.sum(kpt_mask != 0)
        lkpt += kpt_loss_factor*((1 - torch.exp(-d/(s*(4*sigmas**2)+1e-9)))*kpt_mask).mean()
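To make the excerpt easier to follow in isolation, here is a minimal NumPy sketch of the same OKS-style term. It is an illustration under stated assumptions, not the repository's code: the function name `oks_keypoint_loss` and its argument layout are hypothetical, `SIGMAS` is assumed to be the usual COCO per-keypoint constants (the excerpt does not show where `sigmas` comes from), and the guard against a batch with no labeled keypoints is an addition.

```python
import numpy as np

# COCO per-keypoint sigmas for the 17 body keypoints (an assumption here:
# the excerpt uses `sigmas` without showing where it is defined).
SIGMAS = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72,
                   .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0


def oks_keypoint_loss(pred_xy, target_xy, box_wh, sigmas=SIGMAS, eps=1e-9):
    """Hypothetical standalone version of the OKS-style loss in the excerpt.

    pred_xy, target_xy: (N, K, 2) keypoint coordinates
    box_wh:             (N, 2) target box width and height
    sigmas:             (K,) per-keypoint constants
    """
    # A keypoint counts as labeled when its target x is non-zero.
    mask = target_xy[..., 0] != 0                        # (N, K)
    # Squared Euclidean distance between prediction and target.
    d = ((pred_xy - target_xy) ** 2).sum(axis=-1)        # (N, K)
    # Box area plays the role of the object scale.
    s = box_wh.prod(axis=1, keepdims=True)               # (N, 1)
    similarity = np.exp(-d / (s * (4 * sigmas ** 2) + eps))
    # total / labeled, guarded against an all-unlabeled batch
    # (the guard is an addition, not in the excerpt).
    factor = mask.size / max(int(mask.sum()), 1)
    return float(factor * ((1 - similarity) * mask).mean())


rng = np.random.default_rng(0)
target = rng.uniform(0.1, 1.0, size=(2, 17, 2))
wh = np.array([[0.5, 0.8], [0.3, 0.4]])
print(oks_keypoint_loss(target, target, wh))         # exact match gives 0.0
print(oks_keypoint_loss(target + 0.05, target, wh))  # any offset gives a positive loss
```

Note that the denominator uses the box area times `4 * sigmas**2`, as in the excerpt, so the value is an OKS-like similarity and not necessarily identical to the COCO evaluation metric.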

yolov7-pose

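Last week, the YOLOv7 authors also released a model for human keypoint detection, based on YOLOv7-w6.

So far the authors have provided the .pt file and an inference test script; interested readers can take a look. This article focuses on exporting yolov7-pose.pt to an ONNX file and running inference with it.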

【yolov7-pose + onnxruntime】

First download the official pretrained model, then run inference with the provided script:

% weigths = torch.load('weights/yolov7-w6-pose.pt')
% image = cv2.imread('sample/pose.jpeg')
!python pose.py 

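1. yolov7-w6 vs. yolov7-w6-pose

First, look at the detection head used by yolov7-w6.

2. Modifying the export script

Running the export script as-is to produce the ONNX file is bound to fail. As the previous section showed, the pose.pt model uses the IKeypoint detection head, so the script needs a corresponding change. Insert the following at this location in export.py: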

    # Original code:
    for k, m in model.named_modules():
        m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility
        if isinstance(m, models.common.Conv):  # assign export-friendly activations
            if isinstance(m.act, nn.Hardswish):
                m.act = Hardswish()
            elif isinstance(m.act, nn.SiLU):
                m.act = SiLU()
    model.model[-1].export = not opt.grid  # set Detect() layer grid export

    # Modified code:
    for k, m in model.named_modules():
        m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility
        if isinstance(m, models.common.Conv):  # assign export-friendly activations
            if isinstance(m.act, nn.Hardswish):
                m.act = Hardswish()
            elif isinstance(m.act, nn.SiLU):
                m.act = SiLU()
        elif isinstance(m, models.yolo.IKeypoint):
            m.forward = m.forward_keypoint  # assign forward (optional)
            # the detection head is switched here
    model.model[-1].export = not opt.grid  # set Detect() layer grid export

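forward_keypoint already exists in the original yolov7 repo source. The authors have implemented it, but presumably have not yet intended to expose it for general use.

Export the model with the following command: python export.py --weights 'weights/yolov7-w6-pose.pt' --img-size 960 --simplify True

3. Inference with onnxruntime

The onnxruntime inference code: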

import onnxruntime
import matplotlib.pyplot as plt
import torch
import cv2
from torchvision import transforms
import numpy as np
from utils.datasets import letterbox
from utils.general import non_max_suppression_kpt
from utils.plots import output_to_keypoint, plot_skeleton_kpts

device = torch.device("cpu")

image = cv2.imread('sample/pose.jpeg')
image = letterbox(image, 960, stride=64, auto=True)[0]
image_ = image.copy()
image = transforms.ToTensor()(image)
image = torch.tensor(np.array([image.numpy()]))

print(image.shape)
sess = onnxruntime.InferenceSession('weights/yolov7-w6-pose.onnx')
out = sess.run(['output'], {'images': image.numpy()})[0]
out = torch.from_numpy(out)

output = non_max_suppression_kpt(out, 0.25, 0.65, nc=1, nkpt=17, kpt_label=True)
output = output_to_keypoint(output)
nimg = image[0].permute(1, 2, 0) * 255
nimg = nimg.cpu().numpy().astype(np.uint8)
nimg = cv2.cvtColor(nimg, cv2.COLOR_RGB2BGR)
for idx in range(output.shape[0]):
    plot_skeleton_kpts(nimg, output[idx, 7:].T, 3)

# %matplotlib inline  (only needed in a notebook)
plt.figure(figsize=(8, 8))
plt.axis('off')
plt.imshow(nimg)
plt.savefig("tmp")  # save before show(); saving afterwards can write an empty figure
plt.show()

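Inference quality is nearly lossless, while latency is roughly halved. A few additional notes:

  • In image = letterbox(image, 960, stride=64, auto=True)[0], stride is the maximum stride. Compared with yolov5s, yolov7-w6 downsamples one extra time, which adds a stride of 64 on top of 8, 16, and 32.
  • In output = non_max_suppression_kpt(out, 0.25, 0.65, nc=1, nkpt=17, kpt_label=True), values such as nc and kpt_label can be seen when inspecting the model file in Netron.
  • The resulting ONNX file is nearly three times the size of the original half-precision model; the cause remains to be investigated.
  • yolov7-w6-pose is extremely memory-hungry: inference on a single 960×960 image needs 2-4 GB of GPU memory, and training is even harder to imagine.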
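The stride note above can be checked with a little arithmetic. The sketch below is a hypothetical helper, `letterbox_shape`, that reproduces only the shape calculation as I understand the repo's letterbox to perform it with auto=True (resize to fit, then pad each side up to the next multiple of stride); it does not touch pixel data and is not the repo's implementation.

```python
def letterbox_shape(h, w, new_size=960, stride=64):
    """Return the (height, width) after an auto-padded letterbox resize.

    Hypothetical helper: mirrors the assumed shape arithmetic only.
    """
    # Scale so the longer side fits new_size.
    r = min(new_size / h, new_size / w)
    unpad_h, unpad_w = round(h * r), round(w * r)
    # Pad only as much as needed to reach a multiple of `stride`.
    pad_h = (new_size - unpad_h) % stride
    pad_w = (new_size - unpad_w) % stride
    return unpad_h + pad_h, unpad_w + pad_w


# A 1280x720 frame: both sides end up divisible by 64.
print(letterbox_shape(720, 1280, stride=64))  # (576, 960)
# With stride=32 the height is 544, which is not divisible by 64.
print(letterbox_shape(720, 1280, stride=32))  # (544, 960)
```

With stride=32 the padded height would not divide evenly at the deepest (stride-64) level of the network, which is why the largest stride is the one passed to letterbox.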