电子学报 (Acta Electronica Sinica) ›› 2020, Vol. 48 ›› Issue (4): 631-636. DOI: 10.3969/j.issn.0372-2112.2020.04.002

• Research Articles •

Binocular Scene Flow Estimation Based on Semantic Segmentation

CHEN Zhen1, MA Long1, ZHANG Cong-xuan1,2, LI Ming1, WU Jun-jie1, JIANG Shao-feng1

  1. Key Laboratory of Nondestructive Testing (Ministry of Education), Nanchang Hangkong University, Nanchang, Jiangxi 330063, China;
  2. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2018-12-07  Revised: 2019-05-13  Online: 2020-04-25  Published: 2020-04-25
  • Corresponding author: ZHANG Cong-xuan
  • About the authors: CHEN Zhen (male, born November 1969 in Jiujiang, Jiangxi) received his B.S., M.S., and Ph.D. degrees from Northwestern Polytechnical University in 1993, 2000, and 2003, respectively. He is a professor and doctoral supervisor at Nanchang Hangkong University; his research interests include computer vision, image processing, and pattern recognition. E-mail: dr_chenzhen@163.com. MA Long (male, born September 1993 in Hebi, Henan) is a master's student at the School of Measuring and Optical Engineering, Nanchang Hangkong University; his research interests include image detection and intelligent recognition. E-mail: 1007516637@qq.com
  • Supported by: National Natural Science Foundation of China (No.61866026, No.61772255, No.61866025); Superior Science and Technology Innovation Team Project of Jiangxi Province (No.20152BCB24004, No.20165BCB19007); Youth Science Fund of Jiangxi Province (No.20171BAB212012); China Postdoctoral Science Foundation (No.2019M650894)

Abstract: To address the motion boundary blurring that existing scene flow methods produce in complex scenes and under large displacements and motion occlusion, this paper proposes a binocular scene flow estimation method based on semantic segmentation. First, we use a convolutional neural network to partition the image into regions with semantic labels, build a separate motion model for each semantic category, compute optical flow with the aid of these semantic priors, and compute disparity with the semi-global matching (SGM) method for binocular stereo. Next, we segment the input image into superpixels and couple the optical flow and disparity by least squares to solve for the motion parameters of each superpixel. Finally, we add semantic segmentation boundary constraints to the optimization energy function and obtain the final scene flow estimate by updating the pixel-to-superpixel and superpixel-to-plane mappings. We compare the proposed approach with representative scene flow methods on the KITTI 2015 benchmark. The experimental results demonstrate that our method achieves high accuracy and robustness, and is particularly effective at preserving motion boundaries in complex scenes and under motion occlusion.
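As a concrete illustration of the stereo, superpixel, and least-squares steps summarized above, the sketch below computes an SGM-style disparity map, SLIC superpixels, and a per-superpixel disparity-plane fit with OpenCV and NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the image file names are placeholders, cv2.ximgproc requires the opencv-contrib package, and the fit covers only the disparity term, whereas the paper additionally couples optical flow and semantic boundary constraints in its energy function.

```python
# Minimal sketch (not the paper's code): SGM disparity, SLIC superpixels,
# and a least-squares disparity-plane fit per superpixel.
# Assumes a rectified stereo pair and opencv-contrib-python for cv2.ximgproc.
import cv2
import numpy as np

left = cv2.imread("left.png")     # placeholder file names
right = cv2.imread("right.png")

# 1) Disparity via semi-global (block) matching.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5,
                             P1=8 * 3 * 5 ** 2, P2=32 * 3 * 5 ** 2)
disp = sgbm.compute(cv2.cvtColor(left, cv2.COLOR_BGR2GRAY),
                    cv2.cvtColor(right, cv2.COLOR_BGR2GRAY)).astype(np.float32) / 16.0

# 2) Superpixel segmentation of the reference image (SLIC).
slic = cv2.ximgproc.createSuperpixelSLIC(left, region_size=25)
slic.iterate(10)
labels = slic.getLabels()          # pixel -> superpixel mapping

# 3) For each superpixel, fit a slanted plane d = a*x + b*y + c by least squares.
ys, xs = np.indices(disp.shape)
planes = {}
for k in np.unique(labels):
    m = (labels == k) & (disp > 0)             # use valid disparities only
    if m.sum() < 3:
        continue
    A = np.stack([xs[m], ys[m], np.ones(m.sum())], axis=1)
    planes[k], *_ = np.linalg.lstsq(A, disp[m], rcond=None)
```

In the paper's full model each superpixel carries rigid 3D motion parameters rather than only a disparity plane, and the labels map above plays the role of the pixel-to-superpixel mapping that the subsequent energy minimization updates together with the superpixel-to-plane assignment.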

Key words: semantic segmentation, scene flow, deep learning, binocular stereo matching, least squares method, superpixel segmentation, motion occlusion, edge preservation

CLC Number: