

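1. College of Information Engineering, Capital Normal University, Beijing 100048, China
2. School of Mathematical Sciences, Capital Normal University, Beijing 100048, China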
Received: 6 September 2021
Revised: 27 December 2021
Published: 25 April 2023
WANG Ying, WANG Jing, GAO Lan, et al. An Improved Attention Mechanism Algorithm Model and Hardware Acceleration Design Method[J]. ACTA ELECTRONICA SINICA, 2023, 51(04): 1021-1029. DOI: 10.12263/DZXB.20211229. (in Chinese)
Applying attention mechanisms to convolutional neural networks inevitably increases the amount of computation and the latency. To address this problem, this paper proposes an optimized CBAM (Convolutional Block Attention Module) algorithm model and implements it in hardware. Starting from the conventional CBAM model structure, we analyze latent problems inside the algorithm and design a model that better matches the original intent of extracting attention importance parameters. At the same time, we optimize the computation flow to reduce the amount of data computation and to combine operators for maximum parallelism. Taking advantage of the efficient and flexible parallel arrays that can be built on an FPGA (Field Programmable Gate Array), we design a hardware acceleration engine structure for the improved CBAM algorithm. Experimental results show that, compared with the conventional CBAM mechanism, the improved attention mechanism maintains almost the same accuracy as the original algorithm model. Inference experiments were run with the hardware acceleration engine deployed on an FPGA at an operating frequency of 180 MHz. The analysis shows that, with the same hardware resources, the proposed design achieves a 10.2% increase in computation speed for the attention mechanism circuit and a 4.5% increase in inference speed for the VGG16 network model.
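For readers unfamiliar with the baseline, the following is a minimal sketch of the conventional CBAM data flow as published by Woo et al. (ECCV 2018): channel attention followed by spatial attention. It is not the optimized model proposed in this paper, whose details the abstract does not give. The function names, the nested-list `C x H x W` layout, the weight shapes, and the 3x3 spatial kernel (the original CBAM uses 7x7) are assumptions made for illustration only.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x, w1, w2):
    """Per-channel weights. x: C x H x W nested lists.
    w1: (C/r) x C and w2: C x (C/r) form the shared two-layer MLP (assumed shapes)."""
    avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(max(row) for row in ch) for ch in x]

    def mlp(v):
        hidden = [max(0.0, sum(w * a for w, a in zip(row, v))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    # Both pooled descriptors pass through the same MLP and are summed.
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(x, kernel):
    """Per-pixel weights. kernel: 2 x k x k weights over the [mean, max] maps."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernel[0])
    pad = k // 2
    mean_map = [[sum(x[ch][i][j] for ch in range(c)) / c for j in range(w)]
                for i in range(h)]
    max_map = [[max(x[ch][i][j] for ch in range(c)) for j in range(w)]
               for i in range(h)]
    pooled = [mean_map, max_map]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < h and 0 <= jj < w:  # zero padding
                            acc += kernel[p][di][dj] * pooled[p][ii][jj]
            out[i][j] = sigmoid(acc)
    return out


def cbam(x, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to the feature map x."""
    ca = channel_attention(x, w1, w2)
    x1 = [[[v * ca[ch] for v in row] for row in x[ch]] for ch in range(len(x))]
    sa = spatial_attention(x1, kernel)
    return [[[x1[ch][i][j] * sa[i][j] for j in range(len(sa[0]))]
             for i in range(len(sa))] for ch in range(len(x1))]


# Toy input: 2 channels of 2 x 2, with hypothetical all-zero weights.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [2.0, 1.0]]]
w1 = [[0.0, 0.0]]
w2 = [[0.0], [0.0]]
kernel = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
y = cbam(x, w1, w2, kernel)
```

With all-zero weights every attention value is sigmoid(0) = 0.5, so each element of `y` is the input scaled by 0.25. The explicit loops are written out to make visible the pooling, multiply-accumulate, and sigmoid stages that an accelerator for this module would need to schedule.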