Representative feature networks for few-shot learning

Wang Ronggui, Zheng Yan, Yang Juan, Xue Lixia (School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China)

Abstract
Objective Few-shot learning tasks aim to classify test samples correctly when only a few labeled samples are provided. Metric-learning-based few-shot methods map samples into an embedding space and compute distances as similarity measures to predict categories, but they fail to summarize, from the multiple support vectors within a class, representative features that characterize the class concept, which limits further improvement of classification accuracy. To address this problem, this paper proposes the representative feature network, which improves classification performance significantly. Method The representative feature network adopts a metric learning strategy based on class representative features, using the representative features learned from the support vector set of each class to express the class concept effectively and thus classify test samples correctly. Specifically, the network contains two modules: the embedding module first extracts embedded vectors at a high level of abstraction, and the stacked embedded vectors are then passed through the representative feature module to obtain the representative feature of each class. The category of a test sample is then predicted by computing the distances between its embedded vector and the class representative features. Finally, the proposed mixture loss function is used to compute the loss, enlarging inter-class distances in the embedding space and reducing the misclassification of similar classes. Result Extensive experiments on the Omniglot, miniImageNet, and Cifar100 datasets verify that the model not only achieves the best known classification accuracy but also maintains high training efficiency. Conclusion The representative feature network can effectively summarize representative features from the multiple support vectors of a class to classify test samples; compared with using support vectors directly, it is more robust and further improves classification accuracy under few-shot conditions.
Representative feature networks for few-shot learning

Wang Ronggui, Zheng Yan, Yang Juan, Xue Lixia(School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China)

Abstract
Objective Few-shot learning aims to build a classifier that recognizes new, unseen classes given only a few samples. Existing solutions fall mainly into three categories: data augmentation, meta-learning, and metric learning. Data augmentation can reduce over-fitting in the limited data regime of a new class; a typical solution augments data in the feature domain by hallucinating features. Such methods have some effect on few-shot classification, but because the data space is extremely small, the available transformations are considerably limited and cannot fully solve the over-fitting problem. Meta-learning suits few-shot learning because it learns a high-level strategy across similar tasks: some methods learn good initial values, some learn task-level update strategies, and others construct external memory storage to remember past information for comparison during testing. These methods achieve strong few-shot classification results, but their network structures become increasingly complicated because of the use of RNNs (recurrent neural networks), and their efficiency is low. Metric learning is simple and efficient: it first maps a sample to the embedding space and then computes distances to obtain a similarity metric for predicting the category. Some approaches improve the representation of features in the embedding space, some use learnable distance metrics to compute the distance for the loss, and others combine meta-learning to improve accuracy. However, this type of method fails to summarize representative features from the multiple support vectors in a class to represent the class concept effectively, a drawback that limits further improvement of few-shot classification accuracy. To address this problem, this study proposes a representative feature network. Method The representative feature network implements a metric learning strategy based on class representative features. It uses the representative features learned from the support vector set of each class to express the class concept effectively, and it uses a mixture loss to reduce the misclassification of similar classes, thereby achieving excellent classification results. Specifically, the representative feature network includes two modules. The embedding module first extracts embedded vectors at a high level of abstraction; the stacked support vector set of each class is then fed to the representative feature module to obtain the class representative feature. The class representative feature fully accounts for the influence of every embedded support vector, whose target may or may not be salient: letting the network learn a different weight for each embedded support vector effectively avoids misclassification caused by representative features biased toward samples with unobvious targets. Then, the distances from the embedded query vectors to each class representative feature are computed to predict the class. In addition, a mixture loss function is proposed for the misclassification of similar classes in the embedding space: cross-entropy loss combined with a relative error loss increases the inter-class distances and reduces the error rate on similar classes. Result Extensive experiments on the Omniglot, miniImageNet, and Cifar100 datasets verify that the model achieves state-of-the-art results.
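Because the Method is described only in prose, a minimal PyTorch sketch may help illustrate the two-module pipeline: learned softmax weights aggregate the embedded support vectors of each class into a representative feature, and embedded queries are classified by their distances to these representatives. The linear scoring head, the tensor shapes, and the Euclidean distance below are illustrative assumptions, not the paper's exact module design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RepresentativeFeature(nn.Module):
    """Aggregate the embedded support vectors of each class into one
    representative feature via learned, softmax-normalized weights."""
    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)  # assumed scoring head

    def forward(self, support: torch.Tensor) -> torch.Tensor:
        # support: (n_way, k_shot, dim) stacked embedded support vectors
        scores = self.scorer(support)          # (n_way, k_shot, 1)
        weights = F.softmax(scores, dim=1)     # attention over the k shots
        return (weights * support).sum(dim=1)  # (n_way, dim) representatives

def predict(query: torch.Tensor, reps: torch.Tensor) -> torch.Tensor:
    """query: (n_query, dim); reps: (n_way, dim) -> class probabilities."""
    dists = torch.cdist(query, reps)   # distance to each class representative
    return F.softmax(-dists, dim=1)    # smaller distance, higher probability

For a hypothetical 5-way, 5-shot episode with 64-dimensional embeddings, rf = RepresentativeFeature(64) followed by predict(torch.randn(15, 64), rf(torch.randn(5, 5, 64))) yields a (15, 5) probability matrix over the five classes. The learned per-shot weights are what lets the module down-weight support samples whose target is not salient.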
For the simple Omniglot dataset, the five-way five-shot classification accuracy is 99.7%, which is 1% higher than that of the original matching network. For the more complex miniImageNet dataset, the five-way five-shot classification accuracy is 75.83%, approximately 18% higher than that of the original matching network. Representative features provide approximately 8% of this improvement, indicating that they effectively express the class prototype by distinguishing the contributions of support vectors whose targets may or may not be salient. The mixture loss provides approximately 1% improvement, indicating that it reduces some misclassification of similar classes in the testing set; the improvement is modest because similar samples are uncommon in the dataset. The remaining 9% improvement is due to fine-tuning on the test set, and the skip connection structure benefits loss propagation relative to the plain connections between modules in the original network. For the Cifar100 dataset, the five-way five-shot classification accuracy is 87.99%, which is 20% higher than that of the original matching network. Moreover, high training efficiency is maintained while performance is significantly improved. Conclusion To address the problem that the original embedding network is too simple to extract high-level features of samples, the improved embedding network in the representative feature network uses a skip connection structure to deepen the network and extract advanced features. To address the problem of noisy support vectors disturbing the classification of test samples, the representative feature network effectively summarizes representative features from the multiple support vectors in a class for classifying query samples. Compared with using support vectors directly, classification with representative features is more robust, further improving accuracy under few-shot conditions. In addition, the mixture loss function proposed for the classification of similar classes enlarges the distances between categories in the embedding space and reduces the misclassification of similar classes. Detailed experiments verify that these improvements achieve strong performance on few-shot learning tasks on the Omniglot, miniImageNet, and Cifar100 datasets while maintaining high training efficiency. For the embedding network, more advanced structures, such as dense connections or SE (squeeze-and-excitation) modules, could be incorporated in future work to further improve the results.
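As with the pipeline above, the mixture loss is described only in words. The sketch below assumes one plausible form: cross-entropy over negative distances combined with a relative error term that compares the true-class distance against the wrong-class distances. The exact relative error formula and the mixing weight lam are assumptions, not the paper's stated definition.

import torch
import torch.nn.functional as F

def mixture_loss(dists: torch.Tensor, target: torch.Tensor, lam: float = 0.1):
    """dists: (n_query, n_way) distances to the class representative features;
    target: (n_query,) ground-truth class indices."""
    ce = F.cross_entropy(-dists, target)  # smaller distance -> larger logit
    # Relative error term (assumed form): ratio of the true-class distance to
    # each wrong-class distance; minimizing it enlarges inter-class margins.
    d_true = dists.gather(1, target.unsqueeze(1))      # (n_query, 1)
    wrong = ~F.one_hot(target, dists.size(1)).bool()   # mask of wrong classes
    rel = (d_true / (dists + 1e-8)).masked_select(wrong).mean()
    return ce + lam * rel

The conclusion also attributes part of the gain to a skip connection structure in the improved embedding network. A residual convolution block of the following kind is one standard realization; the channel counts and layer choices are illustrative, not the paper's exact architecture.

import torch.nn as nn

class ResBlock(nn.Module):
    """3x3 convolutional residual block; the identity skip connection eases
    loss propagation when the embedding network is deepened."""
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.body(x) + x)  # output = F(x) + x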