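School of Information Science and Engineering, Shenyang University of Technology, Shenyang, Liaoning 110870, China
SANG Hai-feng (桑海峰): male, born January 1978 in Shenyang, Liaoning Province. Professor and doctoral supervisor at the Institute of Visual Inspection, Shenyang University of Technology. His main research interests are machine vision inspection and intelligent video analysis. E-mail: sanghaif@163.com
WANG Jin-yu (王金玉, corresponding author): female, born May 1996 in Gaizhou, Liaoning Province. Doctoral student at the Institute of Visual Inspection, Shenyang University of Technology. Her main research interest is pedestrian trajectory prediction. E-mail: 1911131982@qq.com
CHEN Wang-xing (陈旺兴): male, born March 1998 in Fuzhou, Jiangxi Province. Master's student at the Institute of Visual Inspection, Shenyang University of Technology. His main research interest is pedestrian trajectory prediction. E-mail: 1909703861@qq.com
WANG Hai-feng (王海峰): male, born March 1995 in Jilin City, Jilin Province. Master's student at the Institute of Visual Inspection, Shenyang University of Technology. His main research interest is pedestrian trajectory prediction. E-mail: 798466420@qq.com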
Received: 2021-10-29
Revised: 2021-12-27
Published in print: 2023-05-25
SANG Hai-feng, WANG Jin-yu, CHEN Wang-xing, et al. Non-Autoregressive Pedestrian Trajectory Prediction Model Based on the First Perspective[J]. ACTA ELECTRONICA SINICA, 2023, 51(05): 1266-1272. DOI: 10.12263/DZXB.20211467.
Pedestrian trajectory prediction is important in applications such as autonomous driving and surveillance systems. Most current pedestrian trajectory prediction models use an encoder-decoder structure built on recurrent neural networks (RNNs). Their autoregressive decoding accumulates error, and RNNs still handle long-term dependencies in sequences poorly. This paper proposes a non-autoregressive pedestrian trajectory prediction model based on the Transformer network. The non-autoregressive decoder generates all predicted values simultaneously, which reduces accumulated error, and the self-attention mechanism of the Transformer mitigates the long-term dependency problem. The paper also designs a local information enhancement module that captures local features where a pedestrian's motion trend changes, and combines the position and size of the bounding box to encode the effect of perspective projection in the first-person view, so that the trajectory features extracted by the model are more effective. Experimental results on PIE (Pedestrian Intention Estimation), a public first-person-view dataset, show that, compared with the PIE prediction model, the proposed model reduces the average displacement error at 15, 30, and 45 frames and the final displacement error by 24%, 14.5%, 11%, and 6%, respectively.
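The displacement errors reported above can be illustrated with a minimal sketch. It assumes the common definitions: the average displacement error is the mean Euclidean distance between predicted and ground-truth points over the prediction horizon, and the final (end) displacement error is that distance at the last predicted frame. The abstract does not state the paper's exact formula or units, so the function names `displacement_errors` and `relative_reduction`, and the use of 2D points such as bounding-box centers, are illustrative assumptions rather than the authors' implementation.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) points.

    Assumption: Euclidean distance per frame; the paper may define
    the errors differently (for example on squared pixel distances).
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-frame distance between predicted and ground-truth positions.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean over the whole horizon
    fde = dists[-1]                # error at the last predicted frame
    return ade, fde


def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline
```

Under these assumptions, a reported reduction such as 24% would correspond to `relative_reduction(baseline_error, model_error)` evaluated on the same prediction horizon for both models.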