Acta Electronica Sinica, 2020, Vol. 48, Issue (12): 2360-2366. DOI: 10.3969/j.issn.0372-2112.2020.12.011

• Academic Paper •

Object Detection Networks Based on Refined Multi-scale Depth Feature

LI Ya-qian, GAI Cheng-yuan, XIAO Cun-jun, WU Chao, LIU Jia-jia

  1. Key Lab of Industrial Computer Control Engineering of Hebei Province, Yanshan University, Qinhuangdao, Hebei 066004, China
  • Received: 2019-10-11 Revised: 2020-06-05 Online: 2020-12-25 Published: 2020-12-25
    • Corresponding author:
    • LI Ya-qian
    • About the author:
    • GAI Cheng-yuan, male, born in November 1993 in Daxing'anling, Heilongjiang. Master's student at Yanshan University; his main research interests are deep learning and pattern recognition. E-mail: gaichengyuan@126.com

Abstract: Existing deep convolutional neural networks have receptive fields of a single scale and therefore cannot adapt to changes in object scale or to boundary deformation. This paper proposes an object detection network that extracts and fuses multi-scale features. The network reduces pooling and adds a spatial and channel squeeze-and-excitation module to its lower layers to highlight usable detail and generate high-quality feature maps. In addition, a deformable multi-scale feature fusion module is added to the deeper layers; it has receptive fields of several scales and can predict sampling positions according to object boundaries. Fusing the multi-scale features gives the network stronger feature representation and makes it more robust to instances of different scales and to their boundary information. Experiments show that the proposed structure achieves higher average precision than the original structure and also has certain advantages over current mainstream object detection algorithms.

Key words: object detection, feature pyramid network, deformable convolution, channel and spatial squeeze-and-excitation, multi-scale feature fusion
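
The spatial and channel squeeze-and-excitation idea named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the weight layout (`w1`, `w2`, `w_sq`), the omission of bias terms, and the choice to combine the two branches by element-wise addition are all assumptions made for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_se(x, w1, w2):
    """Channel branch. x is a C x H x W nested list.

    w1 (C_r x C) and w2 (C x C_r) are hypothetical fully connected
    weights; biases are omitted for brevity.
    """
    h, w = len(x[0]), len(x[0][0])
    # Squeeze: global average pooling gives one value per channel.
    z = [sum(sum(row) for row in ch) / (h * w) for ch in x]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(a * b for a, b in zip(row, z))) for row in w1]
    gates = [sigmoid(sum(a * b for a, b in zip(row, hidden))) for row in w2]
    # Rescale every channel by its gate.
    return [[[gates[k] * v for v in row] for row in ch]
            for k, ch in enumerate(x)]


def spatial_se(x, w_sq):
    """Spatial branch. w_sq holds C weights of a hypothetical 1x1 convolution."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    # Squeeze across channels: one gate per spatial location.
    gate = [[sigmoid(sum(w_sq[k] * x[k][i][j] for k in range(c)))
             for j in range(w)] for i in range(h)]
    # Rescale every location by its gate, in all channels.
    return [[[gate[i][j] * x[k][i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


def scse(x, w1, w2, w_sq):
    """Combine both branches by element-wise addition (an assumption)."""
    cse = channel_se(x, w1, w2)
    sse = spatial_se(x, w_sq)
    return [[[a + b for a, b in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(cse, sse)]


# Toy input: 2 channels, 2 x 2 spatial size, all-zero weights.
x = [[[1.0, 2.0], [3.0, 4.0]],
     [[5.0, 6.0], [7.0, 8.0]]]
y = scse(x, w1=[[0.0, 0.0]], w2=[[0.0], [0.0]], w_sq=[0.0, 0.0])
```

With all weights set to zero every gate equals 0.5, so the two branches sum back to the input, which is a convenient sanity check before plugging in learned weights.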

CLC number: