Multi-Stage Feature Fusion Object Detection Method for Remote Sensing Image

CHEN Li; ZHANG Fan; GUO Wei; HUANG Yun

doi:10.12263/DZXB.20211421

Research article | Updated: 2025-12-11

- Acta Electronica Sinica, 2023, Vol. 51, No. 12, pp. 3520-3528
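- Author affiliations:
  
  1. Information Engineering University, Zhengzhou, Henan 450001, China
  2. National Digital Switching System Engineering and Technological Research Center, Zhengzhou, Henan 450002, China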
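- About the authors:
  
  CHEN Li, male, born in February 1997 in Yiwu, Zhejiang Province. Master's student at Information Engineering University. His main research interest is computer vision. E-mail: 2464863136@qq.com
  ZHANG Fan (corresponding author), male, born in September 1981. Ph.D. Associate research fellow and master's supervisor at the National Digital Switching System Engineering and Technological Research Center. His main research interests are active defense, artificial intelligence, and high-performance computing. E-mail: 17034203@qq.com
  GUO Wei, male, born in August 1990. Ph.D. Assistant research fellow at the National Digital Switching System Engineering and Technological Research Center. His main research interests are active defense, artificial intelligence, and high-performance computing. E-mail: guowjss@126.com
  HUANG Yun, male, born in September 1993 in Xinyu, Jiangxi Province. Master's student at Information Engineering University. His main research interest is quantization and compression of neural network models. E-mail: yyhuangz@163.com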
- Funding:
  
  Innovative Research Groups Project of the National Natural Science Foundation of China (61521003)
- DOI: 10.12263/DZXB.20211421
  CLC number: TP391
- Received: 2021-10-18;
  
  Revised: 2021-12-18;
  
  Published in print: 2023-12-25
CHEN Li, ZHANG Fan, GUO Wei, et al. Multi-Stage Feature Fusion Object Detection Method for Remote Sensing Image[J]. Acta Electronica Sinica, 2023, 51(12): 3520-3528. DOI: 10.12263/DZXB.20211421.

Abstract

Objects in remote sensing images are multi-scale, have large aspect ratios, and appear at many orientations, which poses new challenges for traditional object detection methods. To address the loss of accuracy that existing methods suffer on remote sensing images with small objects and unbalanced aspect ratios, this paper proposes MF2M (Multi-stage Feature Fusion Method), an object detection method based on multi-stage feature fusion. In the first stage, the method recombines and splits the feature-map channels and then aggregates channel-dimension features by convolution and concatenation, which strengthens the spatial contour information of objects in the output. In the second stage, it uses multi-ratio asymmetric convolution blocks to enhance the high-dimensional global features of objects with large aspect ratios and to reduce the coarse matching between objects and detection boxes; combining serial and parallel processing also removes redundant convolution parameters and accelerates network convergence. Experiments on the DOTA (Dataset for Object deTection in Aerial images) dataset show that adding MF2M to the baseline method raises mAP to 76.44% while maintaining detection speed, which verifies the effectiveness and reliability of the proposed method.
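The parameter saving attributed to asymmetric convolutions can be illustrated with a simple weight count. The sketch below is not the MF2M block itself: the abstract does not give kernel sizes, channel widths, or branch layouts, so the channel width `c`, the kernel sizes, and the branch shapes are all hypothetical values chosen for illustration. It compares a square k x k convolution with a serial 1 x k then k x 1 pair, and counts the weights of a set of parallel branches with different kernel shapes.

```python
from typing import Iterable, Tuple

# Illustrative weight counts only. Channel widths, kernel sizes, and branch
# shapes are assumptions, not values taken from the paper.


def conv_params(c_in: int, c_out: int, kh: int, kw: int, bias: bool = False) -> int:
    """Number of learnable parameters in a standard 2D convolution layer."""
    return c_in * c_out * kh * kw + (c_out if bias else 0)


def square_params(c: int, k: int) -> int:
    """One k x k convolution that keeps the channel width c."""
    return conv_params(c, c, k, k)


def serial_asymmetric_params(c: int, k: int) -> int:
    """A 1 x k convolution followed by a k x 1 convolution, both c -> c."""
    return conv_params(c, c, 1, k) + conv_params(c, c, k, 1)


def parallel_asymmetric_params(c: int, shapes: Iterable[Tuple[int, int]]) -> int:
    """Parallel branches with different kernel shapes, each mapping c -> c."""
    return sum(conv_params(c, c, kh, kw) for kh, kw in shapes)


if __name__ == "__main__":
    c = 256  # hypothetical channel width
    for k in (3, 5, 7):
        sq = square_params(c, k)
        sa = serial_asymmetric_params(c, k)
        print(f"k={k}: square={sq}, serial 1xk + kx1={sa}, ratio={sa / sq:.2f}")

    # Hypothetical multi-ratio branch set aimed at elongated objects.
    branches = [(1, 3), (3, 1), (1, 5), (5, 1)]
    print("parallel branches:", parallel_asymmetric_params(c, branches))
```

Under these assumptions the serial pair needs 2/k of the weights of the square kernel while covering the same k x k receptive field, so the saving grows with kernel size. How MF2M actually mixes serial and parallel branches is specified in the full paper, not in this abstract.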
