Acta Electronica Sinica ›› 2020, Vol. 48 ›› Issue (12): 2384-2393. DOI: 10.3969/j.issn.0372-2112.2020.12.014

• Research Paper •

基于深度帧差卷积神经网络的运动目标检测方法研究

欧先锋1, 晏鹏程1, 王汉谱1, 涂兵1, 何伟1, 张国云1, 徐智2   

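  1. Machine Vision and Artificial Intelligence Research Center, School of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang, Hunan 414006;
    2. Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, Guangxi 541004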
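  • Received: 2020-04-23 Revised: 2020-09-27 Published: 2020-12-25
    • Corresponding authors: ZHANG Guo-yun, XU Zhi
    • About the authors:
    • OU Xian-feng: male, born in July 1983 in Chenzhou, Hunan. Associate professor and master's supervisor at Hunan Institute of Science and Technology. Research interests: computer vision and hyperspectral remote sensing image processing. E-mail: ouxf@hnist.edu.cn
    • YAN Peng-cheng: male, born in June 1995 in Yiyang, Hunan. Master's student at Hunan Institute of Science and Technology. Research interests: deep learning frameworks and algorithms, and computer vision. E-mail: 530865028@qq.com
    • WANG Han-pu: male, born in April 1997 in Yancheng, Jiangsu. Master's student at Hunan Institute of Science and Technology. Research interests: deep learning frameworks and algorithms, and computer vision. E-mail: 1215051195@qq.com
    • TU Bing: male, born in January 1983 in Yueyang, Hunan. Associate professor and master's supervisor at Hunan Institute of Science and Technology. Research interests: computer vision and hyperspectral remote sensing image processing. E-mail: tubing@hnist.edu.cn
    • HE Wei: male, born in January 1983 in Yueyang, Hunan. Associate professor and master's supervisor at Hunan Institute of Science and Technology. Research interests: computer vision and machine learning. E-mail: hewei@hnist.edu.cn
    • Funding: Natural Science Foundation of Hunan Province (No.2020JJ4340, No.2020JJ4343); National Natural Science Foundation of China (No.61662014); Excellent Youth Program of the Education Department of Hunan Province (No.19B245); Hunan Postgraduate Education Innovation Project and Professional Ability Enhancement Project (No.CX20201114); Engineering Research Center of 3D Reconstruction and Intelligent Application Technology of Hunan Province (No.2019-430602-73-03-006049); Emergency Communication Engineering Technology Research Center of Hunan Province (No.2018TP2022); Guangxi Science and Technology Base and Talent Program (No.AD19110022)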

Research of Moving Object Detection Based on Deep Frame Difference Convolution Neural Network

OU Xian-feng1, YAN Peng-cheng1, WANG Han-pu1, TU Bing1, HE Wei1, ZHANG Guo-yun1, XU Zhi2   

  1. School of Information and Communication Engineering, Machine Vision & Artificial Intelligence Research Center, Hunan Institute of Science and Technology, Yueyang, Hunan 414006, China;
    2. Guangxi Key Laboratory of Images and Graphics Intelligent Processing, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
  • Received: 2020-04-23 Revised: 2020-09-27 Online: 2020-12-25 Published: 2020-12-25
    • Corresponding authors: ZHANG Guo-yun, XU Zhi
    • Supported by: Program of Natural Science Foundation of Hunan Province, China (No.2020JJ4340, No.2020JJ4343); National Natural Science Foundation of China (No.61662014); Excellent Youth Program of Education Department of Hunan Province (No.19B245); Hunan Postgraduate Education Innovation Project and Professional Ability Enhancement Project (No.CX20201114); Engineering Research Center of 3D Reconstruction and Intelligent Application Technology of Hunan Province (No.2019-430602-73-03-006049); Emergency Communication Engineering Technology Research Center of Hunan Province (No.2018TP2022); Guangxi Science and Technology Base and Talent Program (No.AD19110022)

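Abstract: Moving object detection in complex scenes is an important problem in computer vision, and its detection accuracy remains a major challenge. This paper proposes and designs a deep frame difference convolutional neural network (Deep Difference Convolutional Neural Network, DFDCNN) for moving object detection in complex scenes. DFDCNN consists of DifferenceNet and AppearanceNet and can predict and segment foreground pixels without post-processing. DifferenceNet has a Siamese Encoder-Decoder structure and learns the changes between two consecutive frames, obtaining temporal information from its inputs (frames t and t+1). AppearanceNet extracts spatial information from its input (frame t) and fuses it with the temporal information. In addition, multi-scale spatial information is retained through multi-scale feature map fusion and stepwise up-sampling, which improves the network's sensitivity to small objects. Experimental results on the public benchmark datasets CDnet2014 and I2R show that DFDCNN not only achieves better detection performance in complex scenes with dynamic backgrounds, illumination changes and shadows, but also detects well in scenes containing small objects.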

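Key words: moving object detection, complex scenes, deep frame difference convolutional neural network, temporal information, spatial information, multi-scale feature map fusion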

Abstract: Moving object detection in complex scenes is an important problem in computer vision, and achieving high detection accuracy remains challenging. This paper proposes a deep frame difference convolutional neural network (DFDCNN) for moving object detection in complex scenes. DFDCNN consists of two sub-networks, DifferenceNet and AppearanceNet, and predicts the foreground segmentation directly, without post-processing. DifferenceNet has a Siamese encoder-decoder structure; it learns the changes between two consecutive frames and thereby obtains temporal information from its inputs (frames t and t+1). AppearanceNet extracts spatial information from the input frame (frame t), and the temporal and spatial information are combined by fusing feature maps. In addition, multi-scale spatial information is retained through multi-scale feature map fusion and stepwise up-sampling, which improves sensitivity to small objects. Experiments on two public benchmark datasets, CDnet2014 and I2R, show that DFDCNN significantly outperforms classic algorithms both qualitatively and quantitatively. It performs much better in complex scenes with dynamic background, illumination variation and shadows, and it also improves detection in scenes that contain small objects.

Key words: moving object detection, complex scenes, deep frame difference convolutional neural network, temporal information, spatial information, multi-scale feature map fusion
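
Two ideas named in the abstract can be illustrated in isolation: differencing two consecutive frames, and fusing multi-scale maps by stepwise up-sampling. The sketch below is not the DFDCNN architecture, which learns these operations with convolutional layers. It is a minimal pure-Python illustration of classical (non-learned) frame differencing and of coarse-to-fine fusion. The function names, the threshold of 25 and the nearest-neighbour up-sampling are all assumptions made for this example and do not come from the paper.

```python
from typing import List

# A grayscale image or feature map, stored as rows of numeric values.
Grid = List[List[float]]


def frame_difference_mask(frame_t: Grid, frame_t1: Grid,
                          threshold: float = 25.0) -> List[List[int]]:
    """Classical frame differencing (assumed baseline, not DifferenceNet).

    A pixel is marked as foreground (1) when the absolute intensity change
    between frame t and frame t+1 exceeds the (assumed) threshold.
    """
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_t, row_t1)]
        for row_t, row_t1 in zip(frame_t, frame_t1)
    ]


def upsample2x(fmap: Grid) -> Grid:
    """Nearest-neighbour 2x up-sampling: each value becomes a 2x2 block."""
    out: Grid = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_stepwise(pyramid: List[Grid]) -> Grid:
    """Fuse maps ordered from coarsest to finest, one 2x step at a time.

    Each level is assumed to be exactly twice the height and width of the
    previous one. Fusion here is element-wise addition (an assumption).
    """
    fused = pyramid[0]
    for finer in pyramid[1:]:
        up = upsample2x(fused)
        fused = [[u + f for u, f in zip(up_row, fine_row)]
                 for up_row, fine_row in zip(up, finer)]
    return fused


if __name__ == "__main__":
    frame_t = [[10, 10, 10, 10], [10, 10, 10, 10]]
    frame_t1 = [[10, 200, 10, 10], [10, 10, 10, 60]]
    print(frame_difference_mask(frame_t, frame_t1))

    coarse = [[1.0]]
    fine = [[0.0, 1.0], [2.0, 3.0]]
    print(fuse_stepwise([coarse, fine]))
```

Up-sampling one level at a time and adding the finer map at each step keeps the fine-resolution detail in the output, which is the intuition behind retaining multi-scale information for small objects.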
