RGB-D Scene Parsing Based on Spatial Structured Inference Deep Fusion Networks

doi:10.3969/j.issn.0372-2112.2018.05.035


Last updated: 2025-07-08

- Acta Electronica Sinica, Vol. 46, Issue 5, Pages 1253-1258 (2018)
- Author affiliations:
  
  1. College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001
  2. School of Aeronautics, Northwestern Polytechnical University, Xi'an, Shaanxi 710072
- Funding:
  
  National Key Research and Development Program of China (No. 2016YFB1000400); Harbin Outstanding Youth Talents Fund of Heilongjiang Province (No. 2017RAYXJ016); Free Exploration Foundation for Central Universities (No. HEUCF170605); National Natural Science Foundation of China (No. 61573284)
- DOI: 10.3969/j.issn.0372-2112.2018.05.035
  CLC: TP391.413; TP18
- Published online: 25 May 2018; published: 2018

RGB-D Scene Parsing Based on Spatial Structured Inference Deep Fusion Networks[J]. Acta Electronica Sinica, 2018, 46(5): 1253-1258. DOI: 10.3969/j.issn.0372-2112.2018.05.035.

Abstract

Convolutional neural networks have limited capacity for spatial structured learning in RGB-D scene parsing. To address this shortcoming, we propose spatial structured inference deep fusion networks (SSIDFNs), built on deep learning. The embedded structural inference layer combines conditional random fields (CRFs) with a spatial structured inference model, which lets it learn the three-dimensional spatial distribution of objects and the three-dimensional spatial relationships among objects more comprehensively and accurately. On this basis, the feature fusion layer draws on both deep belief networks and improved CRFs, and performs deep structured learning from the fused comprehensive semantic information of objects and the semantic correlation information among objects. Experimental results show that the proposed SSIDFNs achieve the best mean accuracy on the standard RGB-D datasets NYUDv2 and SUNRGBD, 53.8% and 54.6% respectively, which should help in implementing intelligent computer vision tasks such as robot task planning and self-driving cars.
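The abstract reports "mean accuracy" without defining it on this page. The sketch below assumes the meaning common in scene-parsing work: pixel accuracy is computed per class and then averaged over the classes present in the ground truth. The function name `mean_class_accuracy`, the `ignore_label` parameter, and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def mean_class_accuracy(y_true, y_pred, ignore_label=None):
    """Average of per-class pixel accuracies (assumed definition, see lead-in).

    y_true, y_pred: equal-length sequences of integer class labels, one per pixel.
    ignore_label:   optional ground-truth label to skip (e.g. unlabelled pixels).
    """
    correct = defaultdict(int)  # correctly predicted pixels per ground-truth class
    total = defaultdict(int)    # labelled pixels per ground-truth class
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("no labelled pixels to evaluate")
    # Every class counts equally, however few pixels it covers.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy example: class 0 dominates, and the single class-1 pixel is missed.
y_true = [0, 0, 0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 0, 2]
score = mean_class_accuracy(y_true, y_pred)
pixel_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"mean class accuracy: {score:.3f}, pixel accuracy: {pixel_accuracy:.3f}")
```

Under this definition a rare class weighs as much as a frequent one, so the score can sit well below overall pixel accuracy when small objects are misclassified.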

