Cascaded Inverse Residual Network for Lightweight Object Detection Model in Remote Sensing Image

CHEN Li; ZHANG Fan; GUO Wei; HUANG Yun; LI Jizhong

doi:10.12263/DZXB.20210831

Research article | Updated: 2025-12-08

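- Acta Electronica Sinica, 2023, Vol. 51, No. 9, pp. 2588-2597
- Author affiliations:
  1. Information Engineering University, Zhengzhou, Henan 450001
  2. National Digital Switching System Engineering and Technological Research Center, Zhengzhou, Henan 450002
  3. Zhengzhou Strategic Delivery Base (郑州战略投送基地), Zhengzhou, Henan 450002
- Author biographies:
  CHEN Li, male, born in February 1997 in Yiwu, Zhejiang Province. Master's student at Information Engineering University. Research interest: computer vision. E-mail: 2464863136@qq.com
  ZHANG Fan (corresponding author), male, born in September 1981. Ph.D. Associate research fellow and master's supervisor at the National Digital Switching System Engineering and Technological Research Center. Research interests: active defense, artificial intelligence, and high-performance computing. Chinese Institute of Electronics member No. E190013697M.
  GUO Wei, male, born in August 1990. Ph.D. Assistant research fellow at the National Digital Switching System Engineering and Technological Research Center. Research interests: active defense, artificial intelligence, and high-performance computing. Chinese Institute of Electronics member No. E190029991M. E-mail: guowjss@126.com
  HUANG Yun, male, born in September 1993 in Xinyu, Jiangxi Province. Master's student at Information Engineering University. Research interests: quantization and compression of neural network models, and endogenous network security. E-mail: yyhuangz@163.com
- Funding: Natural Science Foundation Innovative Research Group Project (61521003)
- DOI: 10.12263/DZXB.20210831
- CLC number: TP391
- Received: 2021-07-02; revised: 2021-10-15; published in print: 2023-09-25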
- Citation: CHEN Li, ZHANG Fan, GUO Wei, et al. Cascaded Inverse Residual Network for Lightweight Object Detection Model in Remote Sensing Image[J]. Acta Electronica Sinica, 2023, 51(09): 2588-2597. DOI: 10.12263/DZXB.20210831.

Abstract

Real-time object detection in remote sensing scenes has significant research value and practical importance. Current remote sensing object detection models are slow because targets appear at many angles, are densely arranged, and sit against complex backgrounds. To address this, a cascaded inverted residual convolution (CIRC) structure is proposed. The structure uses depthwise separable convolution as its basic convolution unit to speed up model computation. On this basis, it transposes the channel matrix, cascades depthwise convolutions, and increases the number of residual connection layers to strengthen the multi-dimensional features of the target. Multiple levels of the module are then stacked to improve detection performance. Building on RetinaNet, this paper uses CIRC to design a fast, lightweight object detection network, CIRCN (Cascaded Inverted Residual Convolution Net). In addition, an angle variable is introduced in the training phase and participates in backpropagation, and an angle offset is added to the horizontal box in the inference phase, which effectively improves the match between oriented targets and detection boxes. Experimental results on the DOTA dataset show that CIRCN reaches a detection speed of 42 fps with a slight loss of accuracy, which is 3.5 times higher than the baseline algorithm. The results verify the effectiveness and reliability of the proposed algorithm.
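
The abstract attributes much of CIRC's speed to using depthwise separable convolution as the basic unit. As a rough illustration of why that unit is cheap, the sketch below counts parameters and multiply-accumulate operations (MACs) for a standard convolution and for its depthwise separable counterpart. The layer sizes (256 input and output channels, a 3x3 kernel, a 64x64 output map) and the function names are illustrative assumptions, not values taken from the paper. The count ignores biases and assumes stride 1, and it does not model the extra cascaded depthwise layers or residual connections that CIRC adds.

```python
def standard_conv_cost(c_in, c_out, k, h, w):
    """Parameters and MACs of a k x k standard convolution.

    Assumes stride 1 and an h x w output map; biases are ignored.
    """
    params = k * k * c_in * c_out
    macs = params * h * w  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h, w):
    """Parameters and MACs of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h * w
    return params, macs


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k, h, w = 256, 256, 3, 64, 64

std_params, std_macs = standard_conv_cost(c_in, c_out, k, h, w)
ds_params, ds_macs = depthwise_separable_cost(c_in, c_out, k, h, w)

# The cost ratio simplifies to 1/c_out + 1/k**2.
ratio = ds_macs / std_macs

print(f"standard:            {std_params} params, {std_macs} MACs")
print(f"depthwise separable: {ds_params} params, {ds_macs} MACs")
print(f"cost ratio:          {ratio:.4f}")
```

For this hypothetical layer the separable version needs roughly 11.5% of the standard convolution's MACs. Measured throughput such as the reported 42 fps also depends on memory access and hardware, so an operation count like this is only a first-order guide.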
