Cascaded Inverse Residual Network for Lightweight Object Detection Model in Remote Sensing Image

CHEN Li; ZHANG Fan; GUO Wei; HUANG Yun; LI Ji-zhong

doi:10.12263/DZXB.20210831


PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 51, Issue 9, Pages: 2588-2597 (2023)
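- Author affiliations:
  
  1. Information Engineering University, Zhengzhou, Henan 450001
  2. National Digital Switching System Engineering Technology Research Center, Zhengzhou, Henan 450002
  3. Zhengzhou Strategic Delivery Base (郑州战略投送基地), Zhengzhou, Henan 450002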
- DOI: 10.12263/DZXB.20210831
  CLC: TP391
- Received: 02 July 2021; Revised: 15 October 2021; Published: 25 September 2023

CHEN Li, ZHANG Fan, GUO Wei, et al. Cascaded Inverse Residual Network for Lightweight Object Detection Model in Remote Sensing Image[J]. ACTA ELECTRONICA SINICA, 2023, 51(09): 2588-2597. DOI: 10.12263/DZXB.20210831.

Abstract

Real-time object detection in remote sensing scenes has significant research value and practical importance. Current remote sensing object detection models are slow because targets appear at multiple angles, are densely arranged, and sit against complex backgrounds. To address this, a cascaded inverted residual convolution (CIRC) structure is proposed. The structure uses depthwise separable convolution as its basic convolution unit, which speeds up the model's computation. On this basis, the multi-dimensional features of the object are strengthened by transposing the channel matrix, cascading depthwise convolutions, and increasing the number of residual connection layers. Multiple modules are then stacked to improve the model's detection performance. Based on RetinaNet, this paper uses CIRC to design a fast, lightweight object detection network, CIRCN (Cascaded Inverted Residual Convolution Net). In addition, an angle variable is introduced in the training phase and participates in backpropagation, and an angle offset is added to the horizontal bounding box in the inference phase, which improves the match between oriented objects and detection boxes. Experimental results on the DOTA dataset show that CIRCN reaches a detection speed of 42 fps with a slight loss of accuracy, which is 3.5 times higher than the baseline algorithm. The results verify the effectiveness and reliability of the proposed algorithm.
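The abstract names depthwise separable convolution as the basic convolution unit. As a rough illustration of why that choice reduces computation, the sketch below compares multiply-accumulate (MAC) counts for a standard convolution and a depthwise separable one. It is not the authors' code: the function names and the layer size are hypothetical, and stride 1 with "same" padding is assumed.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a k x k standard convolution (stride 1, 'same' padding assumed)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer size, chosen only for illustration (not taken from the paper).
h, w, c_in, c_out, k = 64, 64, 128, 128, 3
std = standard_conv_macs(h, w, c_in, c_out, k)
sep = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2
print(f"standard: {std:,} MACs, separable: {sep:,} MACs, ratio: {ratio:.3f}")
```

The ratio depends only on the output channel count and the kernel size, so the saving grows with wider layers. How much of the reported speed-up comes from this unit, as opposed to the rest of the CIRC design, is not stated in the abstract.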


