Research on Lightweight Image Classification Algorithm Based on Multi-Branch Bottleneck Structure

SU Tian-tian; WANG Hui-min; ZHANG Xiao-feng

doi:10.12263/DZXB.20220920


PAPERS | Updated: 2025-12-08

- Acta Electronica Sinica, Vol. 51, Issue 5, Pages 1319-1326 (2023)
- Author affiliation: School of Physics and Information Technology, Shaanxi Normal University, Xi'an, Shaanxi 710119, China
- Funding: National Natural Science Foundation of China (11874252)
- DOI: 10.12263/DZXB.20220920
- CLC: TP391
- Received: 05 August 2022; Revised: 10 November 2022; Published: 25 May 2023

SU Tian-tian, WANG Hui-min, ZHANG Xiao-feng. Research on Lightweight Image Classification Algorithm Based on Multi-Branch Bottleneck Structure [J]. Acta Electronica Sinica, 2023, 51(5): 1319-1326. DOI: 10.12263/DZXB.20220920.

Abstract

Traditional convolutional neural networks suffer from large parameter counts and long training times, while lightweight models often lack recognition accuracy. This paper proposes RemulbNet (Residual multi-branch structured Network), a lightweight multi-branch network based on ResNet. RemulbNet uses a multi-branch structure in the backbone of the residual structure to increase feature diversity, a variant of depthwise separable convolution to reduce the number of model parameters, and the Mish activation function to strengthen the network's nonlinear expressive capacity. Together, these measures reduce model size while improving classification accuracy. Network performance is tested on image recognition datasets. On a 5-class flower recognition task, RemulbNet improves recognition accuracy by 3.9% over ResNet, reduces the number of model parameters by 71%, reduces model size by 77%, and shortens training time by about 40%. Compared with the lightweight networks MobileNet v2 and ShuffleNet v2, RemulbNet also performs well in recognition accuracy, model size, and training time across different image classification datasets.
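Two of the ingredients named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definition of Mish, x * tanh(softplus(x)), and the standard depthwise separable factorization (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution). The abstract refers to a variant of depthwise separable convolution whose details are not given here, so the parameter count below is for the textbook form only. The function names and layer sizes are hypothetical and are not taken from the paper.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x)."""
    # Numerically stable softplus: max(x, 0) + log1p(exp(-|x|)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weight count of a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weight count of a k x k depthwise plus 1 x 1 pointwise convolution (bias omitted)."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    # Illustrative layer sizes only (assumption, not from the paper).
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    sep = separable_conv_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
    print(f"mish(1.0) = {mish(1.0):.4f}")
```

For these illustrative sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the general mechanism behind the parameter reduction; the 71% figure in the abstract is a whole-model result and cannot be derived from this single-layer count.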


