Motion Compensation Optimization Method for 3D Multi-Object Tracking

WANG Shun-hong; ZHANG Yu; SHEN Jiang-nan; JI Jian-min; ZHANG Yan-yong

doi:10.12263/DZXB.20220104

Research article | Updated: 2025-12-11

- Motion Compensation Optimization Method for 3D Multi-Object Tracking
- Acta Electronica Sinica, 2024, Vol. 52, No. 2, pp. 528-539
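- Author affiliation:
  School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027, China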
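- About the authors:
  WANG Shun-hong, male, born in 1998. Master's student at the University of Science and Technology of China. His main research interests include edge computing and perception for autonomous driving systems. E-mail: wangshunhong@mail.ustc.edu.cn
  ZHANG Yu, female, born in 1972. Ph.D., professor, CCF Distinguished Member. Her main research interests include programming systems and optimization for emerging domains, such as intelligent unmanned systems and quantum computing. E-mail: yuzhang@ustc.edu.cn
  SHEN Jiang-nan, female, born in 2000. Undergraduate student at the University of Science and Technology of China. Her main research interests include perception for autonomous driving systems and machine learning. E-mail: jnshen.ustc@gmail.com
  JI Jian-min, male, born in 1984. Ph.D., associate professor, CCF member. His main research interests include mobile robots and deep reinforcement learning. E-mail: jianmin@ustc.edu.cn
  ZHANG Yan-yong, female, born in 1975. Ph.D., professor, CCF member. Her main research interests include edge computing, the artificial intelligence of things, and perception for unmanned systems. E-mail: yanyongzhang_ustc@ustc.edu.cn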
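- Funding:
  Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100500); National Natural Science Foundation of China (62272434); Anhui Provincial Key Research and Development Program, Standardization Special Project (202104h04020039)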
- DOI: 10.12263/DZXB.20220104
- CLC number: TP399
- Received: 2022-01-21; revised: 2022-12-12; published in print: 2024-02-25
WANG Shun-hong, ZHANG Yu, SHEN Jiang-nan, et al. Motion Compensation Optimization Method for 3D Multi-Object Tracking[J]. Acta Electronica Sinica, 2024, 52(02): 528-539. DOI: 10.12263/DZXB.20220104

Abstract

Three-dimensional (3D) multi-object tracking is a key module in autonomous driving systems, and the quality of its results depends mainly on the accuracy of the data association step in the tracking module. Most existing tracking methods compute the similarity of objects between two frames from appearance features or motion features. Motion-based methods usually associate objects using the intersection over union (IoU) between the 3D bounding boxes of the current frame and those of historical frames. However, this approach has a serious drawback when the observation point is itself moving: the data observed in the two frames lie in different local coordinate systems, so the motion model cannot accurately predict the position of a tracked object in the next frame. To address this problem, this paper introduces data from the observation point's own inertial measurement unit (IMU) or global positioning system (GPS). After each frame arrives, the rotation and translation between the local coordinate system of the current frame and that of the previous frame are calculated, and the states of the tracked objects are then compensated according to this coordinate transformation so as to cancel the offset caused by the motion of the observation point. This motion compensation strengthens the data association step of the tracking module, raising the association success rate of 3D bounding boxes during tracking, reducing the number of false associations, and improving the accuracy of 3D multi-object tracking. Prototype validation on related tracking frameworks and the KITTI dataset shows that the proposed motion compensation optimization method improves accuracy by about 1%.
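
The compensation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the observation point moves on a flat ground plane, so the rotation reduces to a yaw angle and height is left unchanged, and it assumes that the planar pose of the observation point in a fixed world frame has already been derived from GPS/IMU data. The names `EgoPose`, `TrackState`, `wrap_angle`, and `compensate` are hypothetical and chosen only for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class EgoPose:
    """Planar pose of the observation point in a fixed world frame.

    Assumed to be derived from GPS/IMU data; roll and pitch are ignored.
    """
    x: float
    y: float
    yaw: float  # heading in radians, counter-clockwise


@dataclass
class TrackState:
    """Centre and heading of a tracked 3D box in a local (observer) frame."""
    x: float
    y: float
    z: float
    yaw: float


def wrap_angle(a: float) -> float:
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def compensate(track: TrackState, prev: EgoPose, cur: EgoPose) -> TrackState:
    """Re-express a track from the previous local frame in the current one."""
    # Previous local frame -> world frame (rotate by prev.yaw, then translate).
    c, s = math.cos(prev.yaw), math.sin(prev.yaw)
    wx = prev.x + c * track.x - s * track.y
    wy = prev.y + s * track.x + c * track.y
    # World frame -> current local frame (inverse rigid transform).
    c, s = math.cos(cur.yaw), math.sin(cur.yaw)
    dx, dy = wx - cur.x, wy - cur.y
    lx = c * dx + s * dy
    ly = -s * dx + c * dy
    # The box heading changes by the difference of the two observer headings.
    new_yaw = wrap_angle(track.yaw + prev.yaw - cur.yaw)
    return TrackState(lx, ly, track.z, new_yaw)
```

For example, a static object 10 m straight ahead in the previous frame is placed 8 m ahead after the observation point advances 2 m without turning. In a tracker, such a step would run on every stored track state before prediction and IoU-based association, so that predicted and detected boxes are compared in the same local coordinate system.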
