Meta-transfer learning in cross-domain image classification with few-shot learning

Du Yandong; Feng Lin; Tao Peng; Gong Xun; Wang Jun

Published: 2023-09-20
DOI: 10.11834/jig.220664
2023 | Volume 28 | Number 9

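Du Yandong¹, Feng Lin¹, Tao Peng¹, Gong Xun², Wang Jun³ (1. School of Computer Science, Sichuan Normal University, Chengdu 610101, China; 2. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031, China; 3. School of Business, Sichuan Normal University, Chengdu 610101, China)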

Abstract

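Objective: Mainstream few-shot learning methods based on meta-learning assume that training tasks and test tasks follow the same or similar distributions. On cross-domain tasks with large distribution gaps, however, these methods suffer from weak generalization and poor classification accuracy. Meanwhile, few-shot learning methods based on transfer learning do not account for the mismatch between the sample categories seen in training and those seen in testing, and so fail to leave enough feature embedding space during training. To improve cross-domain image classification when labeled samples are scarce, we propose a compact meta-transfer learning method, compressed meta transfer learning (CMTL).

Method: On the meta-learning side, data augmentation is applied to the support set of the target domain to construct new auxiliary tasks that fine-tune the meta-trained parameters, making the classification model better suited to target tasks with large domain gaps. On the transfer learning side, the classification model is trained with a self-compression loss function that compresses the feature embedding space occupied by the base-class data of the source domain; in the fine-tuning stage, this guides novel-class data whose distribution differs greatly from the source domain toward more suitable feature representations. Finally, the classification predictions of the two strategies are fused to give the final classification result.

Result: With mini-ImageNet as the source-domain training dataset, the cross-domain image classification ability of CMTL is tested on the EuroSAT (European Satellite), ISIC (International Skin Imaging Collaboration), CropDisease (Crop Diseases), and Chest-X (ChestX-Ray) datasets. On 5-way 1-shot and 5-way 5-shot cross-domain tasks, the accuracy reaches 68.87% and 87.74%, 34.47% and 49.71%, 74.92% and 93.37%, and 22.22% and 25.40%, respectively. Compared with current mainstream few-shot cross-domain image classification methods, CMTL shows good cross-domain classification ability on every dataset.

Conclusion: The proposed CMTL method combines the respective strengths of transfer learning and meta-learning on cross-domain tasks and can effectively address few-shot cross-domain image classification.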
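The final fusion step described in the abstract can be illustrated with a minimal sketch. The abstract only states that the predictions of the two strategies are fused; the softmax normalization, the equal-weight averaging, and the names `fuse_predictions` and `weight` are assumptions made here for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(meta_scores, transfer_scores, weight=0.5):
    """Blend the class probabilities of two branches.

    `weight` is a hypothetical mixing factor (assumption): 1.0 trusts only
    the meta-learning branch, 0.0 only the transfer-learning branch.
    """
    if len(meta_scores) != len(transfer_scores):
        raise ValueError("both branches must score the same classes")
    p_meta = softmax(meta_scores)
    p_transfer = softmax(transfer_scores)
    fused = [weight * a + (1.0 - weight) * b
             for a, b in zip(p_meta, p_transfer)]
    # The predicted class is the index with the highest fused probability.
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return predicted, fused
```

In a 5-way task each score list would hold five entries, one per class of the episode. Averaging probabilities rather than raw scores keeps the two branches on a comparable scale, which is one plausible design choice among several.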

Keywords

image classification; few-shot cross-domain; meta-learning; transfer learning; few-shot learning (FSL)

Meta-transfer learning in cross-domain image classification with few-shot learning

Du Yandong¹, Feng Lin¹, Tao Peng¹, Gong Xun², Wang Jun³ (1. School of Computer Science, Sichuan Normal University, Chengdu 610101, China; 2. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031, China; 3. School of Business, Sichuan Normal University, Chengdu 610101, China)

Abstract

Objective: Few-shot image classification aims to recognize images from only a few labeled samples. Current few-shot learning methods fall roughly into three categories: gradient optimization, metric learning, and transfer learning. Gradient optimization methods usually consist of two loops. In the inner loop, the base model quickly adapts to a new task using limited labeled samples; in the outer loop, the meta-model learns cross-task knowledge to generalize well. Metric learning methods first map samples into a high-dimensional embedding space and then classify unknown samples according to a similarity measure. Transfer learning methods first pretrain a high-quality feature extractor on a large amount of annotated data and then fine-tune the classifier so that the model suits the current task. Existing few-shot learning methods based on meta-learning assume that the training and test tasks follow the same or similar distributions; on cross-domain classification, they suffer from weak generalization and poor accuracy. Few-shot learning methods based on transfer learning do not consider that sample categories differ between the training and testing stages, and they fail to leave enough feature embedding space for novel-class samples. Integrating transfer learning and meta-learning, we propose compressed meta-transfer learning (CMTL), a model that improves the cross-domain ability of few-shot learning.

Method: The method has two parts. On the one hand, meta-learning uses the prior knowledge produced by meta-training to classify the target task. The source and target tasks may have different data distributions: for example, the base-class data used for training come from the natural-image source dataset mini-ImageNet, whereas the novel-class data used for testing come from the medical target dataset Chest-X. In that case, the meta-knowledge acquired on the source tasks lacks universality and cannot be quickly generalized to the target task, which leads to poor cross-domain classification. In this study, new auxiliary tasks are constructed on the support set of the target domain during testing, using strategies such as random cropping and gamma transformation. These auxiliary tasks fine-tune the meta-trained parameters to improve task adaptability. On the other hand, in transfer learning the sample categories are inconsistent between training and testing. If the deep model is optimized with the traditional softmax loss, little feature embedding space remains for novel-class samples, which leads to unsatisfactory feature extraction and poor classification accuracy. To address this, this study uses a self-compression loss function in the pretraining stage. This loss adjusts the relative positions of the base-class prototypes so that base-class samples are concentrated in the embedding space and part of that space is reserved for novel classes. In the fine-tuning stage, novel classes whose distribution differs greatly from the source domain are guided toward expressive features. Existing studies on cross-domain few-shot learning show that meta-learning methods perform well when the target and source tasks have similar data distributions, whereas transfer learning methods perform better when the distributions differ substantially. To take full advantage of both, the ensemble of the prediction scores of the two strategies is taken as the final classification result.

Result: The proposed model is compared with several state-of-the-art cross-domain few-shot image classification models, such as graph convolutional network (GCN), adversarial task augmentation (ATA), and self-training to adapt representations to unseen problems (STARTUP), as well as other classic methods. The experimental results show that CMTL has advantages over these methods. To validate the model and ensure a fair comparison, extensive experiments are performed under the 5-way 1-shot and 5-way 5-shot settings. With mini-ImageNet as the source-domain training dataset, CMTL is tested on the EuroSAT, ISIC, CropDisease, and Chest-X datasets, where its accuracy (1-shot/5-shot) reaches 68.87%/87.74%, 34.47%/49.71%, 74.92%/93.37%, and 22.22%/25.40%, respectively. Compared with meta-learning and transfer learning models, our model achieves competitive results under both settings on all cross-domain tasks.

Conclusion: This study proposes a cross-domain few-shot image classification model based on meta-transfer learning that improves the generalization ability of few-shot learning. The experimental results show that CMTL combines the advantages of meta-learning and transfer learning methods and performs well on cross-domain few-shot tasks.
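The auxiliary-task construction mentioned in the Method part of the abstract (random cropping and gamma transformation applied to the target-domain support set) might look roughly like the sketch below. It is only an illustration under stated assumptions: images are plain 2-D lists of grayscale values in [0, 1], the gamma range and the names `build_auxiliary_task`, `crop_size`, and `gamma_range` are hypothetical, and how the resulting samples drive fine-tuning is not specified in the abstract.

```python
import random


def gamma_transform(image, gamma):
    """Apply pixel-wise gamma correction to an image with values in [0, 1]."""
    return [[pixel ** gamma for pixel in row] for row in image]


def random_crop(image, size, rng):
    """Cut a random size x size window out of a 2-D image."""
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, height - size)    # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def build_auxiliary_task(support_set, crop_size, rng, gamma_range=(0.5, 1.5)):
    """Create one augmented copy of every labeled support sample.

    `support_set` is a list of (image, label) pairs. Labels are kept, so the
    augmented samples can serve as extra labeled data for fine-tuning.
    `gamma_range` is an assumed range, not a value taken from the paper.
    """
    auxiliary = []
    for image, label in support_set:
        gamma = rng.uniform(*gamma_range)
        cropped = random_crop(image, crop_size, rng)
        auxiliary.append((gamma_transform(cropped, gamma), label))
    return auxiliary
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible. In a real pipeline the cropped images would typically be resized back to the network's input resolution before use.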

Keywords

image classification; few-shot cross-domain; meta-learning; transfer learning; few-shot learning (FSL)