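Rapid Search for Small Object in Reinforcement Learning by Combining Spatio-Temporal Contextual Information

JIANG Hong; MA Jiao-jiao; YAO Hong-ge; CHENG Si-yi; CHEN You; YU Jun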

doi:10.12263/DZXB.20220617


Academic paper | Updated: 2025-12-08

- Rapid Search for Small Object in Reinforcement Learning by Combining Spatio-Temporal Contextual Information
- Acta Electronica Sinica, 2023, Vol. 51, No. 11, pp. 3176-3186
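- Author affiliations:
  
  1. School of Computer Science and Engineering, Xi'an Technological University, Xi'an, Shaanxi 710021
  2. Aviation Engineering School, Air Force Engineering University, Xi'an, Shaanxi 710038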
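- Author biographies:
  
  JIANG Hong (female) was born in 1977 in Baoji, Shaanxi. She is an associate professor and master's supervisor at the School of Computer Science and Engineering, Xi'an Technological University. Her research interests include software engineering, image processing, neural networks, and deep learning. E-mail: 249479898@qq.com
  MA Jiao-jiao (female) was born in 1997 in Baoji, Shaanxi. She is a master's student at the School of Computer Science and Engineering, Xi'an Technological University. Her research interests include machine learning and computer vision. E-mail: 2578516632@qq.com
  YAO Hong-ge (male) was born in 1978 in Xi'an, Shaanxi. He is an associate professor and master's supervisor at the School of Computer Science and Engineering, Xi'an Technological University. His research interests include machine learning, computer vision, and artificial intelligence. E-mail: yaohongge@xatu.edu.cn
  CHENG Si-yi (male) was born in 1980 in Nanjing, Jiangsu. He is a professor and master's supervisor at the Aviation Engineering School, Air Force Engineering University. His research interests include object detection and electronic countermeasures. Member of the Chinese Institute of Electronics, membership No. E190050619M. E-mail: csy_316@163.com
  CHEN You (male) was born in 1983 in Yueyang, Hunan. He is an associate professor and master's supervisor at the Aviation Engineering School, Air Force Engineering University. His research interests include information countermeasure theory and technology. E-mail: chenyousky@163.com
  YU Jun (female) was born in 1971 in Chongqing. She is a professor and master's supervisor at the School of Computer Science and Engineering, Xi'an Technological University. Her research interests include image processing and pattern recognition. E-mail: yujun@xatu.edu.cn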
- DOI: 10.12263/DZXB.20220617
  CLC number: TP391.4
- Received: 2022-05-27; revised: 2022-08-31; published in print: 2023-11-25

JIANG Hong, MA Jiao-jiao, YAO Hong-ge, et al. Rapid Search for Small Object in Reinforcement Learning by Combining Spatio-Temporal Contextual Information[J]. Acta Electronica Sinica, 2023, 51(11): 3176-3186. DOI: 10.12263/DZXB.20220617.

Abstract

When searching for an object, the human eye first scans roughly, guided by previous scanning experience, to find locations where the object may be, and then searches those locations in detail. The former can be described as scanning based on temporal contextual information, and the latter as searching based on location contextual information. Inspired by this search pattern, this paper proposes a rapid search method for small objects based on reinforcement learning that integrates spatio-temporal contextual information. The method builds a temporal context module on a reinforcement learning search strategy to simulate the human eye's ability to obtain and use empirical information, then constructs an adaptive multi-scale window to extract location context information, simulating the eye's ability to search carefully at candidate locations. The two kinds of information alternate and cooperate during the search to locate the object. Experimental results show that the proposed method improves on the baseline method by about 2.9% on the MS COCO dataset and can find an object within five search steps.
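The alternation described in the abstract, a coarse glance informed by past steps followed by a closer multi-scale look, can be illustrated with a toy loop. This is only a sketch under stated assumptions and not the authors' implementation: `multi_scale_windows`, `search`, `propose`, `score`, `base`, `scales`, and `threshold` are hypothetical names, the window scales are fixed rather than adaptive, and the reinforcement learning policy is replaced by a caller-supplied function that receives the search history.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]      # (x0, y0, x1, y1), half-open pixel box
History = List[Tuple[Point, float]]  # past (center, best score) pairs


def multi_scale_windows(center: Point, image_size: Tuple[int, int],
                        base: int = 32,
                        scales: Tuple[int, ...] = (1, 2, 4)) -> List[Box]:
    """Square windows of several sizes around `center`, clipped to the image.

    Assumption: fixed scales stand in for the paper's adaptive window.
    """
    cx, cy = center
    width, height = image_size
    boxes = []
    for s in scales:
        half = base * s // 2
        boxes.append((max(0, cx - half), max(0, cy - half),
                      min(width, cx + half), min(height, cy + half)))
    return boxes


def search(image_size: Tuple[int, int],
           propose: Callable[[History], Point],
           score: Callable[[Box], float],
           max_steps: int = 5,
           threshold: float = 0.5) -> Tuple[Optional[Box], int]:
    """Alternate a coarse proposal with a multi-scale inspection.

    `propose` is a hypothetical stand-in for a learned search policy; it sees
    the history of earlier steps, which plays the role of temporal context.
    """
    history: History = []
    for step in range(1, max_steps + 1):
        center = propose(history)                          # coarse glance
        windows = multi_scale_windows(center, image_size)  # detailed look
        best = max(windows, key=score)
        best_score = score(best)
        if best_score >= threshold:
            return best, step                              # object found
        history.append((center, best_score))
    return None, max_steps                                 # budget exhausted
```

A caller would supply `propose` (for example, a trained policy) and `score` (for example, a detector confidence); the default `max_steps=5` simply mirrors the five-step budget reported in the abstract.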


