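School of Software, Liaoning Technical University, Huludao, Liaoning 125105
JIANG Wen-tao, male, was born in October 1986 in Dalian, Liaoning Province. He is an associate professor at the School of Software, Liaoning Technical University. His main research interests are image and visual information computing, pattern recognition, and artificial intelligence. E-mail: lntuwulue@163.com
GAO Yuan, male, was born in April 2000 in Shenyang, Liaoning Province. He is a master's student at the School of Software, Liaoning Technical University. His main research interests are image and visual information computing, pattern recognition, and artificial intelligence. E-mail: 1422822508@qq.com
YUAN Heng, female, was born in February 1988 in Huanggang, Hubei Province. She is an associate professor at the School of Software, Liaoning Technical University. Her main research interests are image and visual information computing, pattern recognition, and artificial intelligence. E-mail: lntuyuanheng@163.com
LIU Wan-jun, male, was born in October 1959 in Beizhen, Liaoning Province. He is a professor at the School of Software, Liaoning Technical University. His main research interests are software engineering theory, image and visual information computing, pattern recognition, and artificial intelligence. E-mail: liuwanjun39@163.com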
Received: 2024-01-26
Revised: 2024-06-05
Published in print: 2024-07-25
JIANG Wen-tao, GAO Yuan, YUAN Heng, et al. Image Classification Network of Gating Mechanism[J]. Acta Electronica Sinica, 2024, 52(07): 2393-2406. DOI:10.12263/DZXB.20240104
To extract more expressive and discriminative key features, reduce the loss of key features as they propagate through the network, and improve the image classification ability of neural networks, a new image classification network of gating mechanism (GMNet) is proposed. First, shallow features are extracted with gated convolution, which performs the convolution selectively through a gating mechanism and improves the network's ability to extract key features from the original image. Second, an interpolation gated convolution (IGC) module is designed, which combines Lanczos interpolation with gated convolution to strengthen shallow features while extracting more discriminative ones, improving the non-linear expression ability of the features. Third, a large kernel gated attention mechanism (LGAM) module is designed, which fuses large kernel attention with gated convolution to achieve selective enhancement and selective fusion of features and to raise the contribution of key-region features. Finally, the LGAM module is embedded into the residual branch so that the model learns the features and contextual information of the input data more effectively, reducing the loss of key features during information transmission and improving the network's classification ability. The method achieves classification accuracies of 97.05%, 83.68%, 97.68%, 90.60%, and 83.05% on the CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof datasets, respectively, which are average improvements of 3.26%, 7.08%, 3.44%, 2.65%, and 5.02% over current advanced methods. Compared with existing mainstream network models, the proposed gating mechanism image classification network enhances the non-linear expression ability of features, extracts more expressive and discriminative key features, reduces the loss of key features, raises the contribution of key-region features, and effectively improves the image classification ability of neural networks.
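The two building blocks named in the abstract, Lanczos interpolation and gated convolution, can be illustrated with a minimal pure-Python sketch in 1-D. This is not the authors' implementation: the abstract does not specify the IGC or LGAM architectures, so the code only shows a Lanczos kernel used for upsampling and a gated convolution of the form activation(feature) * sigmoid(gate). The function names, the window size a = 3, the border clamping, the zero padding, and the choice of ReLU as the feature activation are all assumptions made for illustration.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, otherwise 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_resample_1d(signal, scale, a=3):
    """Upsample a 1-D signal by an integer factor (assumed helper)."""
    n = len(signal)
    out = []
    for j in range(n * scale):
        x = j / scale                      # output position in input coordinates
        lo = math.floor(x) - a + 1         # first of the 2a contributing taps
        acc, wsum = 0.0, 0.0
        for i in range(lo, lo + 2 * a):
            w = lanczos_kernel(x - i, a)
            idx = min(max(i, 0), n - 1)    # clamp indices at the borders
            acc += w * signal[idx]
            wsum += w
        out.append(acc / wsum)             # normalise so constants are preserved
    return out


def conv1d_same(signal, kernel):
    """Zero-padded cross-correlation that keeps the input length."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        s = 0.0
        for t, k in enumerate(kernel):
            j = i + t - half
            if 0 <= j < n:
                s += k * signal[j]
        out.append(s)
    return out


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_conv1d(signal, feature_kernel, gate_kernel):
    """Gated convolution: ReLU(feature branch) scaled by sigmoid(gate branch)."""
    feat = conv1d_same(signal, feature_kernel)
    gate = conv1d_same(signal, gate_kernel)
    return [max(0.0, f) * sigmoid(g) for f, g in zip(feat, gate)]


# Hypothetical usage: upsample first, then let the gate decide what passes.
upsampled = lanczos_resample_1d([0.0, 1.0, 2.0, 3.0], scale=2)
gated = gated_conv1d(upsampled, [0.25, 0.5, 0.25], [-1.0, 2.0, -1.0])
```

The gate branch produces a value in (0, 1) for every position, so each feature response is attenuated or passed through individually. That per-position selection is what the abstract means by performing the convolution selectively. In an actual network both kernels would be learned 2-D convolution weights rather than the fixed lists used here.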