Acta Electronica Sinica ›› 2020, Vol. 48 ›› Issue (8): 1528-1537. DOI: 10.3969/j.issn.0372-2112.2020.08.010

• Research Paper •

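End-to-End Markerless Human Pose Estimation Network Based on High-Dimensional Information Encoding and Decoding with Feature Monitoring

SHEN Li, CHEN Ying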

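  1. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, Jiangsu 214000
  • Received: 2019-05-28 Revised: 2019-11-19 Published: 2020-08-25 Released: 2020-08-25
  • About the authors: SHEN Li, male, born in June 1993 in Duyun, Guizhou, is a master's student at Jiangnan University. His current research interests are 3D image processing and human pose estimation. E-mail: 6161918015@vip.jiangnan.edu.cn; CHEN Ying, female, born in 1976 in Lishui, Zhejiang, is a professor and doctoral supervisor at Jiangnan University. Her current research interests are image processing, information fusion, and pattern recognition. E-mail: chenying@jiangnan.edu.cn
  • Supported by:
    National Natural Science Foundation of China (No.61573168)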

Feature-Monitored High-Dimension Encoder-Decoder Net for End-to-End Markerless Human Pose Estimation

SHEN Li, CHEN Ying   

  1. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, Jiangsu 214000, China
  • Received: 2019-05-28 Revised: 2019-11-19 Online: 2020-08-25 Published: 2020-08-25

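Abstract: To address the influence of the unstructured nature and rotational variability of 3D information in point-cloud space on prediction results, a 3D-information encoding-decoding convolutional neural network with feature monitoring is proposed. The network performs end-to-end markerless human pose estimation from monocular depth maps in 3D space. It is built by connecting feature-monitored encoding-decoding components in series. The first part of a component combines 3D convolution modules in an hourglass-like structure to encode and decode the feature maps; the second part connects residual blocks with different parameters in parallel to monitor and fuse the feature maps. The two parts are joined end to end to form the component. Depending on the size of the dataset, components can be connected in series to build networks of different depths; depending on the data resolution, the component parameters can be set to learn features from coarse to fine, ultimately yielding the best network. Experiments on the ITOP dataset show that the network achieves end-to-end deep learning of spatial 3D information, significantly improves system performance, and attains higher accuracy.

Key words: computer vision, depth image, human pose estimation, deep learning, 3D convolutional network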

Abstract: To address the effect of the unstructured nature and rotational variability of three-dimensional point-cloud information on prediction results, a three-dimensional encoding-decoding convolutional deep learning network with feature monitoring is proposed. The network is composed of feature-monitored encoding-decoding modules connected in series. The first part of each module arranges three-dimensional convolution blocks in an hourglass structure to encode and decode the feature map. The second part connects residual blocks with different parameters in parallel to monitor and fuse the feature maps. Depending on the size of the data set, the modules can be stacked in series to build networks of different depths. Depending on the data resolution, the module parameters can be set to learn features from coarse to fine and ultimately obtain the best network. Experiments on the ITOP dataset show that the network achieves end-to-end deep learning of three-dimensional information, significantly improves system performance, and attains higher accuracy.

Key words: computer vision, depth image, pose estimation, deep learning, 3D-CNN
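
The module wiring described in the abstract can be illustrated with a small shape-tracing sketch. It is not the authors' implementation. It only traces tensor shapes through an hourglass-style 3D encoder-decoder (part 1), followed by parallel residual branches and a fusion step (part 2), with several such modules chained in series. All names (`Layer`, `hourglass`, `monitor`, `build`) are hypothetical, and the 32×32×32 input grid, channel width, number of levels, kernel sizes, and shape-preserving fusion are assumptions chosen for illustration, not values taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = Tuple[int, int, int, int]  # (channels, depth, height, width)


@dataclass(frozen=True)
class Layer:
    name: str
    out_shape: Shape


def hourglass(in_shape: Shape, levels: int, width: int, tag: str) -> List[Layer]:
    """Part 1 (assumed layout): halve the grid per level, then upsample back."""
    _, d, h, w = in_shape
    if levels < 1:
        raise ValueError("need at least one encoder/decoder level")
    if any(s % (2 ** levels) for s in (d, h, w)):
        raise ValueError("grid size must be divisible by 2**levels")
    layers, ch = [], width
    for i in range(levels):                      # encoder path
        d, h, w = d // 2, h // 2, w // 2
        layers.append(Layer(f"{tag}.enc{i}", (ch, d, h, w)))
        ch *= 2                                  # assumed: channels double per level
    for i in reversed(range(levels)):            # decoder path mirrors the encoder
        ch //= 2
        d, h, w = d * 2, h * 2, w * 2
        layers.append(Layer(f"{tag}.dec{i}", (ch, d, h, w)))
    return layers


def monitor(in_shape: Shape, kernels: Tuple[int, ...], tag: str) -> List[Layer]:
    """Part 2 (assumed layout): one residual branch per kernel size, then fusion."""
    if any(k % 2 == 0 for k in kernels):
        raise ValueError("odd kernels are assumed so 'same' padding keeps the shape")
    branches = [Layer(f"{tag}.res_k{k}", in_shape) for k in kernels]
    return branches + [Layer(f"{tag}.fuse", in_shape)]  # assumed shape-preserving fusion


def build(in_shape: Shape, n_components: int, levels: int = 2,
          width: int = 16, kernels: Tuple[int, ...] = (1, 3, 5)) -> List[Layer]:
    """Chain n_components modules in series; each is part 1 followed by part 2."""
    layers: List[Layer] = []
    shape = in_shape
    for n in range(n_components):
        part1 = hourglass(shape, levels, width, f"c{n}")
        part2 = monitor(part1[-1].out_shape, kernels, f"c{n}")
        layers += part1 + part2
        shape = layers[-1].out_shape             # output feeds the next module
    return layers
```

Calling `build((1, 32, 32, 32), n_components=2)` returns the layer plan for two chained modules. Raising `n_components` deepens the network, and changing `levels` or `kernels` adjusts the resolution each module works at, mirroring the two tuning knobs the abstract mentions.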

CLC Number: