Image Classification Network of Gating Mechanism

JIANG Wen-tao; GAO Yuan; YUAN Heng; LIU Wan-jun

doi:10.12263/DZXB.20240104

Academic paper | Updated: 2025-12-24

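- Image Classification Network of Gating Mechanism (Chinese title: 门控机制的图像分类网络)
- Acta Electronica Sinica, 2024, Vol. 52, No. 7, pp. 2393-2406
- Author affiliation:
  School of Software, Liaoning Technical University, Huludao, Liaoning 125105
- About the authors:
  JIANG Wen-tao, male, born in October 1986 in Dalian, Liaoning Province. He is an associate professor at the School of Software, Liaoning Technical University. His main research interests are image and visual information computing, pattern recognition, and artificial intelligence. E-mail: lntuwulue@163.com
  GAO Yuan, male, born in April 2000 in Shenyang, Liaoning Province. He is a master's student at the School of Software, Liaoning Technical University. His main research interests are image and visual information computing, pattern recognition, and artificial intelligence. E-mail: 1422822508@qq.com
  YUAN Heng, female, born in February 1988 in Huanggang, Hubei Province. She is an associate professor at the School of Software, Liaoning Technical University. Her main research interests are image and visual information computing, pattern recognition, and artificial intelligence. E-mail: lntuyuanheng@163.com
  LIU Wan-jun, male, born in October 1959 in Beizhen, Liaoning Province. He is a professor at the School of Software, Liaoning Technical University. His main research interests are software engineering theory, image and visual information computing, pattern recognition, and artificial intelligence. E-mail: liuwanjun39@163.com
- Funding:
  National Natural Science Foundation of China (61601213); Natural Science Foundation of Liaoning Province (20170540426); Key Fund of the Education Department of Liaoning Province (LJYL049)
- DOI: 10.12263/DZXB.20240104
  CLC number: TP391
- Received: 2024-01-26; revised: 2024-06-05; published in print: 2024-07-25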
JIANG Wen-tao, GAO Yuan, YUAN Heng, et al. Image Classification Network of Gating Mechanism[J]. Acta Electronica Sinica, 2024, 52(07): 2393-2406. DOI: 10.12263/DZXB.20240104

Abstract

To extract more expressive and discriminative key features, reduce the loss of key features as information propagates through the network, and improve the image classification ability of neural networks, this paper proposes a new image classification network of gating mechanism (GMNet). First, shallow features are extracted with gated convolution, which performs the convolution selectively through a gating mechanism and improves the network's ability to extract key features from the original image. Second, an interpolation gated convolution (IGC) module is designed; it combines Lanczos interpolation with gated convolution to strengthen shallow features while extracting more discriminative ones, improving the nonlinear expressive ability of the features. Third, a large kernel gated attention mechanism (LGAM) module is designed; it fuses large kernel attention with gated convolution to achieve selective enhancement and selective fusion of features, raising the contribution of key-region features. Finally, the LGAM module is embedded in the residual branch so that the model learns the features and contextual information of the input data more effectively, reducing the loss of key features during information propagation and improving classification ability. The method achieves classification accuracies of 97.05%, 83.68%, 97.68%, 90.60%, and 83.05% on the CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof datasets, respectively, which are on average 3.26%, 7.08%, 3.44%, 2.65%, and 5.02% higher than current advanced methods. Compared with existing mainstream network models, GMNet enhances the nonlinear expressive ability of features, extracts more expressive and discriminative key features, reduces the loss of key features, raises the contribution of key-region features, and effectively improves the image classification ability of neural networks.
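The two building blocks named in the abstract, Lanczos interpolation and gated convolution, can be illustrated with a small one-dimensional toy. This is only a sketch under stated assumptions and not the authors' IGC module, whose exact layout the abstract does not give. It assumes the commonly used gated-convolution form, in which an activated feature branch is multiplied elementwise by a sigmoid gate branch; the ReLU activation, the kernel weights, the edge handling, and all function names below are hypothetical choices made for illustration.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, and 0 outside."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_upsample_1d(signal, factor=2, a=3):
    """Resample a 1D signal to len(signal) * factor samples with Lanczos weights."""
    n = len(signal)
    out = []
    for j in range(n * factor):
        x = j / factor                     # output position in input coordinates
        left = math.floor(x) - a + 1       # first of the 2 * a contributing taps
        acc = 0.0
        norm = 0.0
        for i in range(left, left + 2 * a):
            w = lanczos_kernel(x - i, a)
            s = signal[min(max(i, 0), n - 1)]  # assumption: replicate edge samples
            acc += w * s
            norm += w
        out.append(acc / norm)             # normalise so constant signals stay constant
    return out


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def conv1d_same(signal, kernel):
    """Cross-correlation with zero padding; output length equals input length."""
    n, k = len(signal), len(kernel)
    half = k // 2
    out = []
    for t in range(n):
        acc = 0.0
        for d in range(k):
            idx = t + d - half
            if 0 <= idx < n:
                acc += kernel[d] * signal[idx]
        out.append(acc)
    return out


def gated_conv1d(signal, feature_kernel, gate_kernel):
    """ReLU(feature branch) multiplied elementwise by sigmoid(gate branch)."""
    feat = conv1d_same(signal, feature_kernel)
    gate = conv1d_same(signal, gate_kernel)
    return [max(f, 0.0) * sigmoid(g) for f, g in zip(feat, gate)]


# Hypothetical toy input and hand-picked (not learned) kernels.
signal = [0.0, 1.0, 4.0, 1.0, 0.0]
feature_kernel = [0.25, 0.5, 0.25]
gate_kernel = [1.0, 0.0, -1.0]

up = lanczos_upsample_1d(signal, factor=2)            # interpolate first
y = gated_conv1d(up, feature_kernel, gate_kernel)     # then apply the gated convolution
print([round(v, 3) for v in y])
```

Because the sigmoid gate lies strictly between 0 and 1, each output value is a scaled-down copy of the corresponding feature response. That is the sense in which the gate selects which responses pass through. In a real network both kernels would be learned, and the operations would act on two-dimensional multi-channel feature maps.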
