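1. School of Computer Science and Technology (School of Artificial Intelligence), Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018
2. School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018
3. Zhejiang Guangsha Vocational and Technical University of Construction, Dongyang, Zhejiang 322100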
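ZHANG Na, female, born in 1977, native of Hangzhou, Zhejiang. Master's degree, associate professor. Main research interest: intelligent information processing. E-mail: zhangna@zstu.edu.cn
BAO Zi-qun, male, born in 2001, native of Dongyang, Zhejiang. Undergraduate student at Zhejiang Sci-Tech University. Main research interests: image processing and intelligent information processing. E-mail: 359020134@qq.com
LUO Yuan, male, born in 1995, native of Anlu, Hubei. Master's student. Main research interests: image processing and intelligent information processing. E-mail: 993807182@qq.com
WU Biao, male, born in 1989, native of Hangzhou, Zhejiang. Ph.D. Main research interests: computer vision and pattern recognition. E-mail: biaowuzg@zstu.edu.cn
TU Xiao-mei, female, born in 1995, native of Huanggang, Hubei. Master's degree. Main research interests: image processing and intelligent information processing. E-mail: txm_95@163.com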
Received: 2022-01-25
Revised: 2022-07-19
Published in print: 2023-04-25
ZHANG Na, BAO Zi-qun, LUO Yuan, et al. Application of Improved Cascade R-CNN Algorithm in Target Detection[J]. ACTA ELECTRONICA SINICA, 2023, 51(04): 896-906. DOI: 10.12263/DZXB.20220116.
To address the low detection accuracy and the target occlusion problem of the Cascade R-CNN target detection algorithm, this paper proposes an improved Cascade R-CNN detection algorithm. The algorithm introduces a switchable atrous convolution (SAC) module into the ResNet101 backbone. The module consists mainly of two global context modules and an SAC component; the SAC component convolves the features with different atrous rates, and a switch function gathers the results to improve feature extraction. In addition, a coordinate attention (CA) mechanism is introduced into the ResNet101 residual network. It embeds positional information into channel attention so that direction-aware and position-aware information is captured better, which improves detection accuracy. To handle target occlusion, the Repulsion Loss function is introduced. It consists mainly of an attraction term and a repulsion term: the attraction term pulls each predicted box as close as possible to its matched target box, while the repulsion term pushes the predicted box away from wrong targets, which reduces false detections in non-maximum suppression (NMS) and improves detection accuracy under occlusion. Experimental results on the public iFLYTEK AI Challenge dataset show that, compared with the original algorithm, the improved Cascade R-CNN network raises the detection rate by 2.39%, a modest gain in recognition accuracy.
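The attraction and repulsion terms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the formulation of the original Repulsion Loss paper (a Smooth-L1 attraction term, and a repulsion term that applies a Smooth-ln penalty to the intersection-over-ground-truth, IoG, between a predicted box and the most-overlapping non-matched target). For simplicity the attraction term here works on raw box coordinates rather than regression offsets, the box-to-box repulsion term is omitted, and the function names and the defaults `alpha=0.5`, `sigma=0.5` are illustrative assumptions.

```python
import math

def area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); zero if degenerate."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection(a, b):
    """Area of the overlap between boxes a and b."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))

def iou(a, b):
    """Intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def iog(pred, gt):
    """Intersection over ground-truth area (bounded by 1, unlike a raw overlap)."""
    g = area(gt)
    return intersection(pred, gt) / g if g > 0 else 0.0

def smooth_l1(x):
    """Standard Smooth-L1 penalty on a scalar difference."""
    x = abs(x)
    return 0.5 * x * x if x < 1.0 else x - 0.5

def smooth_ln(x, sigma=0.5):
    """Smooth-ln penalty: logarithmic up to sigma, linear beyond it."""
    x = min(x, 1.0 - 1e-6)  # clamp to keep log(1 - x) finite
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)

def attraction_term(pred, matched_gt):
    """Pulls the predicted box toward its matched target (simplified: raw coordinates)."""
    return sum(smooth_l1(p - g) for p, g in zip(pred, matched_gt))

def repulsion_term(pred, matched_gt, all_gts, sigma=0.5):
    """Pushes the predicted box away from the most-overlapping wrong target."""
    others = [g for g in all_gts if g != matched_gt]
    if not others:
        return 0.0
    rep_gt = max(others, key=lambda g: iou(pred, g))
    return smooth_ln(iog(pred, rep_gt), sigma)

def repulsion_loss(pred, matched_gt, all_gts, alpha=0.5, sigma=0.5):
    """Attraction plus weighted repulsion for one predicted box (assumed weighting)."""
    return (attraction_term(pred, matched_gt)
            + alpha * repulsion_term(pred, matched_gt, all_gts, sigma))
```

With this sketch, a prediction that coincides with its matched target and overlaps no other target incurs zero loss, while a prediction that drifts onto a neighboring target is penalized more the further it intrudes.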