Multi-Branch Structure Based Local Channel Attention with Uncertainty

WU Bang-gu; ZHANG Su-lin; SHI Hong; ZHU Peng-fei; WANG Qi-long; HU Qing-hua

doi:10.12263/DZXB.20201204

PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 50, Issue 2, Pages: 374-382(2022)
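- Author affiliations:
  1. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
  2. Automotive Data of China (Tianjin) Co., Ltd., Tianjin 300380, China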
- DOI: 10.12263/DZXB.20201204
  CLC: TP183; TP391.4
- Received: 28 October 2020
  Revised: 25 January 2021
  Published: 25 February 2022
WU Bang-gu, ZHANG Su-lin, SHI Hong, et al. Multi-Branch Structure Based Local Channel Attention with Uncertainty[J]. ACTA ELECTRONICA SINICA, 2022, 50(02): 374-382. DOI: 10.12263/DZXB.20201204.

Abstract

Recent research shows that attention mechanisms are an effective way to improve the performance of deep convolutional neural networks (CNNs). However, most existing attention methods model the correlation among all channels, which limits the computational efficiency of the model. In addition, these methods do not explicitly consider the impact of uncertainty in the correlation modeling process, and they leave the generalization ability and stability of the attention mechanism largely unexplored. To address these issues, a multi-branch local channel attention (MBLCA) module is proposed. MBLCA learns the weight of each channel by capturing correlations among channels in a local range rather than a global one, which improves computational efficiency. It models the uncertainty of local channel attention through deep Bayesian learning, approximated by Monte Carlo (MC) Dropout, which leads to a multi-branch structure. The proposed MBLCA module can be flexibly embedded in various deep CNN architectures. For example, against state-of-the-art counterparts, ResNet-50 with the MBLCA module achieves a 2.58% improvement in classification accuracy on ImageNet-1K and a 1.9% improvement in average precision (AP) on MS COCO.
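The abstract describes the idea only at a high level, so the following is a minimal pure-Python sketch of how local channel attention with MC Dropout branches might look, not the authors' implementation. Everything concrete here is an assumption for illustration: the function names, the 1D convolution over pooled channel descriptors as the "local" correlation model, the placement of dropout on the descriptors, the averaging of branch outputs, and the default values of `num_branches` and `drop_prob`.

```python
import math
import random


def global_avg_pool(feature_map):
    """Reduce a C x H x W feature map (nested lists) to one descriptor per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]


def local_channel_conv(desc, kernel):
    """Zero-padded 1D convolution across channels.

    Each channel interacts only with its len(kernel) nearest neighbours,
    instead of with all channels as a fully connected layer would.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(desc) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k)) for i in range(len(desc))]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mblca_weights(desc, kernel, num_branches=4, drop_prob=0.2, rng=None):
    """Average channel weights over several dropout-perturbed branches.

    Assumption: each branch samples its own dropout mask on the channel
    descriptors (inverted-dropout scaling) and the branch outputs are averaged.
    """
    if rng is None:
        rng = random.Random(0)
    keep = 1.0 - drop_prob
    totals = [0.0] * len(desc)
    for _ in range(num_branches):
        dropped = [d / keep if rng.random() < keep else 0.0 for d in desc]
        logits = local_channel_conv(dropped, kernel)
        for c, z in enumerate(logits):
            totals[c] += sigmoid(z)
    return [t / num_branches for t in totals]


def apply_attention(feature_map, weights):
    """Rescale every channel of the feature map by its attention weight."""
    return [[[w * v for v in row] for row in ch] for ch, w in zip(feature_map, weights)]
```

In this sketch the cost of the attention step grows with the kernel size rather than with the square of the channel count, which is the efficiency argument the abstract makes for local correlation. Setting `drop_prob=0.0` and `num_branches=1` collapses it to a single deterministic local channel attention branch.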
