Acta Electronica Sinica ›› 2022, Vol. 50 ›› Issue (2): 295-304. DOI: 10.12263/DZXB.20210453
Special topic: Extended-abstract papers
LI Wei-gang, GAN Ping, XIE Lu, LI Song-tao

Received: 2021-04-09
Revised: 2021-10-31
Online: 2022-02-25
Published: 2022-02-25
Abstract:
To address few-shot image classification, this paper proposes a pairwise-based meta learning (PML) method. Transitive transfer learning is used to fine-tune a pre-trained ResNet50 model, yielding a feature encoder better suited to few-shot tasks; this encoder then initializes the meta-learning model's feature encoder during training, further strengthening the meta-learning model's generalization ability. In addition, a meta loss (ML) function is proposed based on the similarity between support-set and query-set samples. It accounts for the pairwise relations among all query-set samples in the feature space, shrinking intra-class distances among positive samples and enlarging inter-class distances between positive and negative samples, thereby improving classification accuracy. Experimental results show that the proposed method reaches 77.65% and 89.65% accuracy on 1-shot and 5-shot tasks, respectively, improvements of 7.38 and 5.65 percentage points over the recent meta-learning method Meta-Baseline.
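The paper does not reproduce the ML loss in closed form here, but its pairwise idea follows the multi-similarity family of losses [16]: for each query embedding, same-class (positive) pairs are pulled together and different-class (negative) pairs pushed apart, with soft log-sum-exp weighting. The sketch below is an illustrative stand-in under that assumption, not the paper's exact formulation; the function name and the scale parameters `alpha`, `beta`, and `margin` are hypothetical choices.

```python
import numpy as np

def pairwise_meta_loss(embeddings, labels, alpha=2.0, beta=40.0, margin=0.5):
    """Multi-similarity-style pairwise loss over a batch of query embeddings.

    Positive pairs (same label) are pulled toward high cosine similarity,
    negative pairs (different labels) pushed toward low similarity, each
    weighted via a log-sum-exp over the pair set. Illustrative only.
    """
    # L2-normalize so the Gram matrix gives cosine similarities
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = x @ x.T
    n = len(labels)
    loss = 0.0
    for i in range(n):
        pos = [sim[i, j] for j in range(n) if j != i and labels[j] == labels[i]]
        neg = [sim[i, j] for j in range(n) if labels[j] != labels[i]]
        if pos:  # penalize positives that are far from the anchor
            loss += np.log1p(np.sum(np.exp(-alpha * (np.array(pos) - margin)))) / alpha
        if neg:  # penalize negatives that are close to the anchor
            loss += np.log1p(np.sum(np.exp(beta * (np.array(neg) - margin)))) / beta
    return loss / n
```

Because the log-sum-exp terms are dominated by the hardest pairs, the loss naturally emphasizes positives far from their class and negatives near it, which is the intra-class/inter-class distance effect the abstract describes: a batch whose classes form tight, well-separated clusters incurs a lower loss than one whose classes overlap.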
Cite as: LI Wei-gang, GAN Ping, XIE Lu, et al. A Few-Shot Image Classification Method by Pairwise-Based Meta Learning[J]. Acta Electronica Sinica, 2022, 50(2): 295-304.
Table 1 Performance of the three similarities on the image retrieval task (data from Ref. [16])

| Recall | S/% | N/% | P/% | SN1/% | SN2/% | SP/% | NP/% | SNP/% |
|---|---|---|---|---|---|---|---|---|
| 1 | 71.9 | 69.7 | 67.0 | 70.4 | 73.2 | 74.6 | 72.2 | 77.3 |
| 2 | 80.0 | 79.3 | 77.4 | 79.5 | 81.5 | 83.8 | 81.7 | 85.3 |
| 4 | 86.4 | 86.2 | 84.7 | 86.2 | 87.6 | 89.7 | 88.0 | 90.5 |
| 8 | 91.0 | 91.0 | 90.0 | 91.1 | 92.6 | 94.1 | 92.4 | 94.2 |
Table 2 Confidence intervals of mean sample accuracy (%) and time per training task under different mining methods

| Task | Weighting | Mining | Class | Class | Time/Train task | Weight file size |
|---|---|---|---|---|---|---|
| 1-shot, 1 024-dim | Our method | Our method | 98.85 | 75.87 | | 33 581 KB |
| 1-shot, 1 024-dim | Our method | MS mining method | | | 0.256 s | 33 581 KB |
| 1-shot, 2 048-dim | Our method | Our method | | | 0.192 s | 92 145 KB |
| 1-shot, 2 048-dim | Our method | MS mining method | 98.79 | 76.16 | 0.256 s | 92 145 KB |
| 5-shot, 1 024-dim | Our method | Our method | 99.50 | | 0.224 s | 33 581 KB |
| 5-shot, 1 024-dim | Our method | MS mining method | | 89.45 | | 33 581 KB |
| 5-shot, 2 048-dim | Our method | Our method | | 88.40 | 0.224 s | 92 145 KB |
| 5-shot, 2 048-dim | Our method | MS mining method | 99.64 | | | 92 145 KB |
Table 3 Confidence intervals of mean sample accuracy (%) on the new classes N under different loss functions

| Index | Loss | Setup | 1-shot/1 024-dim | 1-shot/2 048-dim | 5-shot/1 024-dim | 5-shot/2 048-dim |
|---|---|---|---|---|---|---|
| ① | Cross entropy loss | | 75.85 | 75.99 | 89.56 | |
| ② | MS weighting + MS mining | | | | | 87.93 |
| ③ | MS weighting + Our mining | | 75.51 | 76.69 | 89.60 | 88.21 |
| ④ | ML- | | 75.60 | 76.72 | 89.61 | 88.32 |
| ⑤ | ML- | | 75.59 | 76.75 | 89.61 | 88.23 |
| ⑥ | ML (our) | | | | | |
Table 4 Confidence intervals of mean accuracy (%) of the two sample classes under different models

| Model | Transfer model (1-shot) | Transfer model (5-shot) | PML (1-shot) | PML (5-shot) |
|---|---|---|---|---|
| Class | 88.57 | 95.17 | 92.27 | 96.97 |
Table 5 Comparison of meta-learning methods (confidence intervals of mean sample accuracy (%) on the new classes)

| Model | 1-shot | 5-shot |
|---|---|---|
| Matching Networks[1] | | |
| MAML[20] | 49.27 | 67.28 |
| Prototypical Networks[6] | 51.63 | 66.16 |
| Relation Networks[21] | 54.70 | 71.51 |
| Meta-Baseline[9] | 70.27 | 84.00 |
| Pre-training model (ours) | 63.53 | 82.43 |
| Transfer model (ours) | 72.41 | 83.98 |
| PML (ours) | | |
Table 6 Confidence intervals of mean sample accuracy (%) of meta-learning methods on mini-ImageNet (# denotes DropBlock [22] and label smoothing; results from Refs. [9,23,24])

| Model | Backbone | 1-shot | 5-shot | Model | Backbone | 1-shot | 5-shot |
|---|---|---|---|---|---|---|---|
| Matching Networks[1] | ConvNet-4 | | | Prototypical Networks[6] | ConvNet-4 | 48.70 | 63.11 |
| Activation to Parameter[25] | WRN-28-10 | 59.60 | 73.74 | LEO[26] | WRN-28-10 | 61.76 | 77.59 |
| Chen, et al[27] | ResNet-18 | 51.87 | 75.68 | SNAIL[28] | ResNet-12 | 55.71 | 68.88 |
| AdaResNet[29] | ResNet-12 | 56.88 | 71.94 | TADAM[30] | ResNet-12 | 58.50 | 76.70 |
| MTL[8] | ResNet-12 | 61.20 | 75.50 | MetaOptNet #[13] | ResNet-12 | 62.64 | 78.63 |
| MetaOptNet[13] | ResNet-12 | 60.33 | 76.61 | Meta-Baseline[9] | ResNet-12 | 63.17 | 79.26 |
| E³BM[23] | ResNet-12 | 63.80 | 80.10 | PSST[24] | ResNet-12 | 64.05 | 80.24 |
| Meta-Baseline+ML (our) | ResNet-12 | | | | | | |
Table 7 Confidence intervals of mean sample accuracy (%) of meta-learning methods on tiered-ImageNet (results from Refs. [9,13])

| Model | Backbone | 1-shot | 5-shot | Model | Backbone | 1-shot | 5-shot |
|---|---|---|---|---|---|---|---|
| MAML[20] | ConvNet-4 | | | Prototypical Networks[6] | ConvNet-4 | 53.31 | 72.69 |
| Relation Networks[21] | ConvNet-4 | 54.48 | 71.32 | LEO[26] | WRN-28-10 | 66.33 | 81.44 |
| MetaOptNet[13] | ResNet-12 | 65.99 | 81.56 | Meta-Baseline[9] | ResNet-12 | | 83.29 |
| Meta-Baseline+ML (our) | ResNet-12 | 68.09 | | | | | |
[1] VINYALS O, BLUNDELL C, LILLICRAP T, et al. Matching networks for one shot learning[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: MIT Press, 2016: 3630-3638.
[2] JI Ding-cheng, JIANG Yi-zhang, WANG Shi-tong. Multi-source transfer learning method by balancing both the domains and instances[J]. Acta Electronica Sinica, 2019, 47(3): 692-699. (in Chinese)
[3] PAN S J, YANG Q. A survey on transfer learning[J]. IEEE Transactions on Knowledge & Data Engineering, 2010, 22(10): 1345-1359.
[4] LI Fan-chang, LIU Yang, WU Peng-xiang, et al. A survey on recent advances in meta-learning[J]. Chinese Journal of Computers, 2021, 44(2): 422-446. (in Chinese)
[5] LAKE B, SALAKHUTDINOV R, GROSS J, et al. One shot learning of simple visual concepts[C]//Proceedings of the Annual Meeting of the Cognitive Science Society. California: eScholarship, 2011: 33(33).
[6] SNELL J, SWERSKY K, ZEMEL R. Prototypical networks for few-shot learning[C]//Proceedings of the International Conference on Neural Information Processing Systems. Massachusetts, USA: MIT Press, 2017: 4077-4087.
[7] TAN B, SONG Y, ZHONG E, et al. Transitive transfer learning[C]//Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2015: 1155-1164.
[8] SUN Q, LIU Y, CHUA T S, et al. Meta-transfer learning for few-shot learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 403-412.
[9] CHEN Y, WANG X, LIU Z, et al. A new meta-baseline for few-shot learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA: IEEE, 2020: 04390.
[10] LUBBERS N, LOOKMAN T, BARROS K. Inferring low-dimensional microstructure representations using convolutional neural networks[J]. Physical Review E, 2017, 96(5): 052111.
[11] DECOST B L, FRANCIS T, HOLM E A. Exploring the microstructure manifold: Image texture representations applied to ultrahigh carbon steel microstructures[J]. Acta Materialia, 2017, 133: 30-40.
[12] AZIMI S M, BRITZ D, ENGSTLER M, et al. Advanced steel microstructural classification by deep learning methods[J]. Scientific Reports, 2018, 8(1): 2128.
[13] LEE K, MAJI S, RAVICHANDRAN A, SOATTO S. Meta-learning with differentiable convex optimization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 10657-10665.
[14] REN M, TRIANTAFILLOU E, RAVI S, et al. Meta-learning for semi-supervised few-shot classification[C]//Proceedings of the International Conference on Learning Representations. https://openreview.net, 2018: 1803.00676.
[15] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.
[16] WANG X, HAN X, HUANG W, et al. Multi-similarity loss with general pair weighting for deep metric learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 5022-5030.
[17] WEINBERGER K Q, SAUL L K. Distance metric learning for large margin nearest neighbor classification[J]. Journal of Machine Learning Research, 2009, 10(2): 207-244.
[18] LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[C]//Proceedings of the International Conference on Learning Representations (ICLR). https://openreview.net, 2017: 1711.05101.
[19] ZEILER M D, TAYLOR G W, FERGUS R. Adaptive deconvolutional networks for mid and high level feature learning[C]//Proceedings of the International Conference on Computer Vision. Barcelona, Spain: IEEE, 2011: 2018-2025.
[20] FINN C, ABBEEL P, LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[C]//Proceedings of the International Conference on Machine Learning. New York, USA: ACM, 2017: 1126-1135.
[21] SUNG F, YANG Y, ZHANG L, et al. Learning to compare: Relation network for few-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 1199-1208.
[22] GHIASI G, LIN T Y, LE Q V. DropBlock: A regularization method for convolutional networks[C]//Proceedings of the Advances in Neural Information Processing Systems (NIPS). Massachusetts, USA: MIT Press, 2018: 10727-10737.
[23] LIU Y, SCHIELE B, SUN Q. An ensemble of epoch-wise empirical Bayes for few-shot learning[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, Cham, 2020: 404-421.
[24] CHEN Z, GE J, ZHAN H, et al. Pareto self-supervised training for few-shot learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA: IEEE, 2021: 13663-13672.
[25] QIAO S, LIU C, SHEN W, et al. Few-shot image recognition by predicting parameters from activations[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7229-7238.
[26] RUSU A A, RAO D, SYGNOWSKI J, et al. Meta-learning with latent embedding optimization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 1807.05960.
[27] CHEN W Y, LIU Y C, KIRA Z, et al. A closer look at few-shot classification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 1904.04232.
[28] MISHRA N, ROHANINEJAD M, et al. A simple neural attentive meta-learner[C]//Proceedings of the 6th International Conference on Learning Representations. https://openreview.net, 2018: 1707.03141.
[29] MUNKHDALAI T, YUAN X, MEHRI S, et al. Rapid adaptation with conditionally shifted neurons[C]//Proceedings of the International Conference on Machine Learning. Stockholm, Sweden: PMLR, 2018: 3664-3673.
[30] ORESHKIN B, LÓPEZ P R, LACOSTE A. TADAM: Task dependent adaptive metric for improved few-shot learning[C]//Proceedings of the Advances in Neural Information Processing Systems. Massachusetts, USA: MIT Press, 2018: 721-731.