Acta Electronica Sinica ›› 2023, Vol. 51 ›› Issue (1): 192-201. DOI: 10.12263/DZXB.20211688

• Research Paper

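6D Pose Estimation of Low Texture Industrial Parts Based on Pseudo-Siamese Neural Network

WANG Shen-long, YONG Yu, WU Chen-rui

  1. College of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2021-12-23 Revised: 2022-03-23 Online: 2023-01-25 Published: 2023-02-23
    • Corresponding author:
    • WU Chen-rui
    • About the authors:
    • WANG Shen-long, male, born in August 1989 in Anqing, Anhui Province. Associate professor and master's supervisor at the College of Mechanical Engineering, University of Shanghai for Science and Technology. His main research interests include stochastic dynamics and control, machine vision, and machine learning. E-mail: shenlongwang@usst.edu.cn
      YONG Yu, male, born in November 1996 in Yangzhou, Jiangsu Province. Master's student at the College of Mechanical Engineering, University of Shanghai for Science and Technology. His research interests are object detection and 6D pose estimation. E-mail: 1219817191@qq.com
      WU Chen-rui (corresponding author), male, born in September 1989 in Harbin, Heilongjiang Province. Lecturer and master's supervisor at the College of Mechanical Engineering, University of Shanghai for Science and Technology. His main research interests include robot visual servo control and pose estimation of industrial parts. E-mail: wuchenrui@usst.edu.cn
    • Funding:
    • National Natural Science Foundation of China, Young Scientists Fund (52105525); National Natural Science Foundation of China, General Program (12172226)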
Abstract:

Estimating the 6D pose of a target object from a single RGB image is widely used in robotic grasping, virtual reality, autonomous driving, and related fields. To address the insufficient accuracy of pose estimation for low-texture objects, this paper proposes a pose estimation method based on a pseudo-siamese neural network. First, RGB images from different viewing angles are obtained as training samples by rendering CAD models, which avoids the cumbersome acquisition and annotation of datasets in deep learning. Second, a pseudo-siamese network structure is used to learn the similarity between two-dimensional image features and the features of the object's three-dimensional mesh model. Specifically, a fully convolutional network and a three-dimensional point cloud semantic segmentation network form the two branches of the pseudo-siamese network; they extract high-dimensional deep features from the two-dimensional image and the three-dimensional model, and the network infers dense 2D-3D correspondences. Finally, the pose of the object is recovered with the PnP-RANSAC method. Experimental results on a simulated dataset show that the proposed method achieves high accuracy and robustness.

Key words: deep learning, 6D pose estimation, simulated dataset, pseudo-siamese neural network, point-wise dense matching
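
The dense 2D-3D matching step described in the abstract can be illustrated with a minimal sketch. It assumes that each image pixel and each model point already has a feature vector from its network branch, and that a pixel is paired with the model point of highest cosine similarity. The abstract does not state the similarity measure, feature dimension, or rejection rule, so the cosine measure, the `min_similarity` threshold, the function names, and the toy values below are all illustrative assumptions, not the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def match_pixels_to_points(pixel_feats, point_feats, min_similarity=0.5):
    """Pair each pixel feature with the most similar 3D point feature.

    Returns (pixel_index, point_index, similarity) tuples. Pixels whose best
    cosine similarity falls below min_similarity (an assumed threshold) are dropped.
    """
    pix = [l2_normalize(f) for f in pixel_feats]
    pts = [l2_normalize(f) for f in point_feats]
    matches = []
    for i, p in enumerate(pix):
        # Cosine similarity reduces to a dot product after normalization.
        sims = [sum(a * b for a, b in zip(p, q)) for q in pts]
        j = max(range(len(sims)), key=sims.__getitem__)
        if sims[j] >= min_similarity:
            matches.append((i, j, sims[j]))
    return matches


# Toy features (hypothetical values): 3 pixels, 3 model points, 2-D embeddings.
pixel_feats = [[1.0, 0.1], [0.0, 2.0], [-1.0, -1.0]]
point_feats = [[0.0, 1.0], [3.0, 0.0], [1.0, 1.0]]
matches = match_pixels_to_points(pixel_feats, point_feats)
```

In this toy case the first two pixels are matched and the third is rejected because no model point resembles it. In a full pipeline, the surviving pixel coordinates and their matched 3D points would be passed, together with the camera intrinsics, to a PnP-RANSAC solver (OpenCV's `cv2.solvePnPRansac` is one option) to recover rotation and translation.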

CLC Number: