Research Advances on 3D Object Detection in Autonomous Driving

CHEN Jian; SU Si-jiao; HUANG Li-qin; ZHAO Tie-song

doi:10.12263/DZXB.20250043

SURVEYS AND REVIEWS | Updated: 2025-10-16

- ACTA ELECTRONICA SINICA Vol. 53, Issue 6, Pages: 2131-2156(2025)
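- Author affiliations:
  
  1. College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China
  2. Fujian Key Lab for Intelligent Processing and Wireless Transmission of Media Information, Fuzhou, Fujian 350108, China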
- Funding:
  
  National Natural Science Foundation of China (62171134; 62271149); Science and Technology Projects of Fuzhou City (2023-P-001)
- DOI: 10.12263/DZXB.20250043
  CLC: TP391.4
- Received: 10 January 2025
  
  Revised: 28 February 2025
  
  Published: 25 June 2025

CHEN Jian, SU Si-jiao, HUANG Li-qin, et al. Research Advances on 3D Object Detection in Autonomous Driving[J]. Acta Electronica Sinica, 2025, 53(06): 2131-2156. DOI: 10.12263/DZXB.20250043

Abstract

In recent years, autonomous driving has gained increasing attention due to its significant potential for improving road safety and traffic efficiency. The perception system plays a crucial role in modern autonomous driving systems: it aims to accurately estimate the state of the surrounding environment and to provide reliable observations for prediction and planning. As an important component of the perception system, 3D object detection predicts the positions, sizes, and categories of objects surrounding the autonomous vehicle. This paper surveys recent research advances in 3D object detection for autonomous driving. It covers both single-modal detection and multi-modal fusion detection, and discusses the advantages and limitations of the methods that each category builds on different sensors. Furthermore, the paper compares the performance of representative algorithms on public datasets, summarizes commonly used training strategies, and discusses future development directions in this field.
