SANG Hai-feng, WANG Jin-yu, CHEN Wang-xing, et al. Non-Autoregressive Pedestrian Trajectory Prediction Model Based on the First Perspective[J]. ACTA ELECTRONICA SINICA, 2023, 51(05): 1266-1272. DOI: 10.12263/DZXB.20211467.
Non-Autoregressive Pedestrian Trajectory Prediction Model Based on the First Perspective
Pedestrian trajectory prediction plays an important role in many applications, such as autonomous driving and surveillance systems. At present, most pedestrian trajectory prediction models are based on recurrent neural networks (RNNs) with an encoder-decoder architecture. RNNs struggle to capture long-term dependencies, and their autoregressive decoding scheme introduces accumulated errors. This paper proposes a Transformer-based non-autoregressive pedestrian trajectory prediction model, whose non-autoregressive decoder generates all predictions simultaneously to reduce accumulated errors, while the self-attention mechanism alleviates the long-term dependence problem. More specifically, this paper designs a local information enhancement module to extract local features when a pedestrian's movement trend changes, and combines the location information and scale of the bounding box to encode the impact of perspective projection in the first-person perspective, which makes the trajectory features extracted by the model more effective. Experimental results show that, compared with the PIE (Pedestrian Intention Estimation) model, the average displacement errors at 15, 30 and 45 frames and the final displacement error are reduced by 24%, 14.5%, 11% and 6%, respectively, on PIE, a public first-person-perspective dataset.
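The displacement errors reported above can be illustrated with a minimal sketch. It assumes each trajectory is a list of 2D points (for example, bounding-box centers) and uses Euclidean distance per frame; the abstract does not state the exact definition used in the paper, so the function name `displacement_errors` and the sample coordinates are hypothetical.

```python
import math

def displacement_errors(pred, truth):
    """Return (average, final) displacement error for one trajectory.

    pred, truth: equal-length sequences of (x, y) points, one per future frame.
    Assumption: Euclidean distance on 2D points; the paper may define it differently.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    # Distance between predicted and true position at each future frame.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all predicted frames
    fde = dists[-1]                # error at the last predicted frame only
    return ade, fde

# Hypothetical three-frame example.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 3.0), (2.0, 4.0)]
ade, fde = displacement_errors(pred, truth)
```

Under these assumptions, evaluating the average error over the first 15, 30 and 45 frames amounts to calling the function on truncated trajectories, for example `displacement_errors(pred[:15], truth[:15])`.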