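LiDAR Point Cloud Tracking Method Using Point-Voxel Relationship Modeling Under 3D Sparse Convolutional Framework

TIAN Sheng-jing; HAN Yi-nan; ZHAO Xian-tong; LIU Xiu-ping; ZHANG Ming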

doi:10.12263/DZXB.20231009

Research article | Updated: 2026-05-07

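- LiDAR Point Cloud Tracking Method Using Point-Voxel Relationship Modeling Under 3D Sparse Convolutional Framework
- Acta Electronica Sinica, 2024, Vol. 52, No. 10, pp. 3527-3540
- Author affiliations:
  
  1. School of Economics and Management, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  2. Dalian University of Technology and Belarusian State University Joint Institute, Dalian University of Technology, Dalian, Liaoning 116024, China
  3. School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116024, China
- Author biographies:
  
  TIAN Sheng-jing, male, born in 1994 in Jining, Shandong. He received his Ph.D. from Dalian University of Technology and is now a faculty-track postdoctoral fellow at China University of Mining and Technology. His research interests include point cloud understanding and 3D vision. E-mail: tye.dut@gmail.com
  HAN Yi-nan, male, born in 1994 in Shenyang, Liaoning. He received his bachelor's degree from Northeastern University and is now a Ph.D. candidate at Dalian University of Technology. His research interests include point cloud understanding and cross-modal representation learning.
  ZHAO Xian-tong, male, born in 1999 in Tai'an, Shandong. He received his bachelor's degree from China University of Mining and Technology and is now a Ph.D. candidate at Dalian University of Technology. His research interests include point cloud tracking and multimodal learning.
  LIU Xiu-ping, female, born in 1964 in Anshan, Liaoning. She received her Ph.D. from Jilin University and is now a professor at Dalian University of Technology. Her research interests include computer vision and computer graphics.
  ZHANG Ming, male, born in 1980 in Boxing, Shandong. He received his Ph.D. from Dalian University of Technology and is now a professor at China University of Mining and Technology. His research interests include big data management and applications, and complex networks.
- Funding:
  
  National Natural Science Foundation of China (62301562); China Postdoctoral Science Foundation (2023M733756); Fundamental Research Funds for the Central Universities (2023QN1055)
- DOI: 10.12263/DZXB.20231009
  Chinese Library Classification (CLC) number: TP391.4
- Received: 2023-10-27; revised: 2024-05-19; published in print: 2024-10-25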

TIAN Sheng-jing, HAN Yi-nan, ZHAO Xian-tong, et al. LiDAR Point Cloud Tracking Method Using Point-Voxel Relationship Modeling Under 3D Sparse Convolutional Framework[J]. Acta Electronica Sinica, 2024, 52(10): 3527-3540.

Abstract

The potential of sparse convolution for single object tracking in LiDAR (Light Detection And Ranging) point clouds has not been fully explored. The vast majority of current point cloud tracking algorithms use point-based backbone networks built on ball neighborhoods, which consume large amounts of GPU memory and computation and model target-aware relationships insufficiently. To address this problem, this paper proposes a LiDAR point cloud tracking algorithm based on a sparse convolutional framework and equips it with a point-voxel dual-channel relationship modeling module, so that target-discriminative information can be embedded efficiently in such a sparse framework. First, a 3D sparse convolutional residual network extracts features from the template and the search area separately, and deconvolution then recovers point-wise features to meet the spatial-position requirements of the tracking task. Second, the relationship modeling module computes a semantic similarity query table between the template features and the search-area features. To capture the fine-grained correlation between template and search area, the module works in two channels. In the spatial point channel, it uses a nearest-neighbor algorithm to find the neighboring template points of each search-area point and extracts the corresponding features from the query table. In the voxel channel, it constructs local multi-scale voxels centered on each search-area point and, using the indices of the template points that fall into each voxel cell, computes the accumulated sum of the corresponding query-table values. Finally, the features of the two channels are fused and fed into a candidate bounding box generation module based on the bird's-eye view, which regresses the target bounding box. To verify the proposed method, we evaluated it on the KITTI and NuScenes datasets. Compared with other algorithms that adopt sparse convolution, the mean success and precision rates improve by 11.0% and 12.0%, respectively. The proposed method thus inherits the efficiency of sparse convolution while also improving tracking accuracy.
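The dual-channel relationship modeling described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of cosine similarity for the query table, the values of `k`, `cell_sizes`, and `grid`, and fusion by concatenation are all assumptions made for illustration, and the real module operates on learned sparse-convolution features rather than the toy arrays used here.

```python
import numpy as np

def similarity_table(search_feat, template_feat):
    """Cosine similarity between every search point and every template point, shape (N, M).
    Cosine similarity is an assumption; the abstract only says 'semantic similarity'."""
    s = search_feat / (np.linalg.norm(search_feat, axis=1, keepdims=True) + 1e-8)
    t = template_feat / (np.linalg.norm(template_feat, axis=1, keepdims=True) + 1e-8)
    return s @ t.T

def point_channel(search_xyz, template_xyz, table, k=2):
    """For each search point, look up the table entries of its k nearest template points, shape (N, k)."""
    d2 = ((search_xyz[:, None, :] - template_xyz[None, :, :]) ** 2).sum(-1)  # (N, M) squared distances
    knn = np.argsort(d2, axis=1)[:, :k]                                      # indices of the k nearest template points
    return np.take_along_axis(table, knn, axis=1)

def voxel_channel(search_xyz, template_xyz, table, cell_sizes=(0.25, 0.5), grid=3):
    """For each search point, build a grid^3 block of voxel cells centred on it at every scale
    and sum the table entries of the template points falling into each cell."""
    n = table.shape[0]
    rel = template_xyz[None, :, :] - search_xyz[:, None, :]          # (N, M, 3) offsets to the search point
    feats = []
    for r in cell_sizes:
        idx = np.floor(rel / r + grid / 2.0).astype(int)             # per-axis cell index; offset 0 lands in the centre cell
        inside = np.all((idx >= 0) & (idx < grid), axis=-1)          # (N, M) mask of template points inside the block
        flat = idx[..., 0] * grid * grid + idx[..., 1] * grid + idx[..., 2]
        feat = np.zeros((n, grid ** 3))
        rows, cols = np.nonzero(inside)
        np.add.at(feat, (rows, flat[rows, cols]), table[rows, cols])  # accumulate similarity per cell
        feats.append(feat)
    return np.concatenate(feats, axis=1)

# Toy data (hypothetical): one search point, three template points, 2-D features.
search_xyz = np.array([[0.0, 0.0, 0.0]])
template_xyz = np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [5.0, 0.0, 0.0]])
search_feat = np.array([[1.0, 0.0]])
template_feat = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 0.0]])

table = similarity_table(search_feat, template_feat)
point_feat = point_channel(search_xyz, template_xyz, table, k=2)
voxel_feat = voxel_channel(search_xyz, template_xyz, table)
fused = np.concatenate([point_feat, voxel_feat], axis=1)  # fusion by concatenation is an assumption
print(point_feat.shape, voxel_feat.shape, fused.shape)
```

In this toy case the third template point lies 5 m from the search point, so it contributes to neither channel even though its feature matches the search feature exactly. Both channels read from the same query table, so the similarity is computed once and only the lookup pattern differs.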
