Privacy-Preserving Multimodal Pedestrian Trajectory Prediction Scheme for Large Model Pre-Training

WEI Jian-hao; ZHOU Ting-sen; LI Chuang; WEN Yan-hua; LI Ke-qin

doi:10.12263/DZXB.20250638

Research article | Updated: 2026-04-24

- Privacy-Preserving Multimodal Pedestrian Trajectory Prediction Scheme for Large Model Pre-Training (original Chinese title: 面向大模型预训练的多模态行人轨迹预测隐私保护方案)
- Acta Electronica Sinica, 2025, Vol. 53, No. 12, pp. 4376-4393
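- Author affiliations:
  
  1. School of Frontier Crossover Studies, Hunan University of Technology and Business, Changsha, Hunan 410205, China
  2. School of Computer Science, Hunan University of Technology and Business, Changsha, Hunan 410205, China
  3. College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
  4. School of Computer Science, State University of New York, New York 12561, USA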
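- About the authors:
  
  WEI Jian-hao, male, was born in August 1989 in Xinyang, Henan Province. He is an associate professor at Hunan University of Technology and Business. His main research interest is artificial intelligence security. E-mail: jianhao@hutb.edu.cn
  ZHOU Ting-sen, male, was born in November 2000 in Wuzhou, Guangxi Zhuang Autonomous Region. He is a graduate student at Hunan University of Technology and Business. His main research interest is safety prediction for intelligent transportation. E-mail: zhoutingsen666@163.com
  LI Chuang, male, was born in November 1990 in Xiangxiang, Hunan Province. He is an associate professor at Hunan University of Technology and Business. His main research interests are high-performance computing and artificial intelligence. E-mail: chuangli@hutb.edu.cn
  WEN Yan-hua, female, was born in September 1985 in Yiyang, Hunan Province. She is an associate professor at Hunan University of Technology and Business. Her main research interest is federated learning. E-mail: yanhua-wen@hutb.edu.cn
  LI Ke-qin, male, was born in May 1963 in Shanghai. He is a professor at Hunan University. His main research interests are parallel computing, edge computing, and cloud computing. E-mail: likq@hnu.edu.cn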
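- Funding:
  
  National Natural Science Foundation of China (62402177); Natural Science Foundation of Hunan Province (2023JJ40237); Outstanding Youth Fund of the Education Department of Hunan Province (22B0648); Major Projects of Xiangjiang Laboratory (23XJ01002; 24XJJCYJ01003; 22XJ01001)
- DOI: 10.12263/DZXB.20250638
- Chinese Library Classification (CLC) number: TP183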
- Received: 2025-07-21; accepted: 2025-12-01; published in print: 2025-12-25
WEI Jian-hao, ZHOU Ting-sen, LI Chuang, et al. Privacy-Preserving Multimodal Pedestrian Trajectory Prediction Scheme for Large Model Pre-Training[J]. Acta Electronica Sinica, 2025, 53(12): 4376-4393. DOI: 10.12263/DZXB.20250638

Abstract

In city-scale traffic large-model applications, multimodal pedestrian trajectory data are sparse, heterogeneous, and strongly correlated in space and time, and using them for large model pre-training raises privacy and security concerns. However, existing privacy-preserving methods for large models predominantly protect a single modality, such as images, text, or trajectories, while neglecting the high-dimensional correlation structures among modalities in the fusion space and the risk of cross-modal semantic leakage embedded in the gradients. As a result, these methods are vulnerable to model inversion and reconstruction attacks that can expose users' real trajectory patterns and behavioral preferences, and they fail to effectively protect the privacy of multimodal fused data and of gradient correlations. Moreover, the attention mechanisms of existing large models are designed mainly for dense data and struggle to process sparse multimodal traffic data efficiently, which lowers prediction accuracy. To address these issues, this paper proposes a Privacy-preserving Multimodal Pedestrian Trajectory prediction scheme for Large model pre-training (PMPTL), which protects both the multimodal data and the pre-trained model efficiently while achieving high-accuracy prediction. Specifically, we first design a Multimodal Sparse trajectory flow fusion method based on a combination of Transformer and Mamba (MSTM), in which the Transformer mechanism models global dependencies in pedestrian trajectory sequences and the Mamba mechanism reduces the complexity of long-sequence modeling, enabling efficient fusion of sparse spatiotemporal features. Second, we propose a Resolution-aware Grid partitioning-based Adaptive weighted Differential Privacy method (RGADP), which dynamically allocates privacy budgets according to grid resolution and the density of grid-level trajectory features, protecting the privacy of fused features with high utility. Third, we propose a multimodal feature enhancement algorithm based on a Dual-Branch Adaptive Sparse self-attention mechanism (DBAS), whose dual-branch self-attention structure dynamically adjusts weights to strengthen the representation of sparse data features, so that the large model efficiently captures the key characteristics of sparse trajectories and pre-training efficiency improves. In addition, an Adaptive Spatiotemporal Top-K sparsification with Dithering Quantization method (ASDQ) reduces gradient redundancy and secures large model pre-training. Finally, an Adaptive Weighted aggregation-based Multimodal sparse Trajectory prediction method (AWMT) dynamically weights different model parameters to balance privacy protection strength against pedestrian trajectory prediction accuracy, thereby achieving high-precision trajectory prediction. Theoretical analysis shows that the scheme satisfies [formula]-differential privacy. Experimental results on two real-world datasets show that the scheme reduces prediction error by 10% compared with state-of-the-art methods and improves communication efficiency by 18.43%.
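To make the gradient-compression idea behind ASDQ more concrete, the following is a minimal sketch of generic Top-K sparsification followed by subtractive dithered quantization. It is not the authors' algorithm: the abstract does not specify how K or the quantization step adapt over space and time, so the function names `top_k_sparsify` and `dithered_quantize`, the fixed `k`, and the fixed `step` are all illustrative assumptions, and the sketch makes no privacy claim of its own.

```python
import random


def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries of grad and zero out the rest.

    `k` is a fixed, hypothetical parameter; the adaptive choice described
    in the abstract is not reproduced here.
    """
    if k <= 0:
        return [0.0] * len(grad)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def dithered_quantize(values, step, rng):
    """Subtractive dithered uniform quantization with grid spacing `step`.

    A uniform dither u is added before rounding to the grid and subtracted
    afterwards, so each reconstructed value differs from its input by at
    most step / 2. Zero entries (dropped by sparsification) stay zero.
    """
    restored = []
    for v in values:
        if v == 0.0:
            restored.append(0.0)
            continue
        u = rng.uniform(-step / 2, step / 2)
        level = round((v + u) / step)      # integer level that would be sent
        restored.append(level * step - u)  # receiver subtracts the same dither
    return restored


# Usage: compress a toy gradient vector (values are made up for illustration).
rng = random.Random(0)  # a shared seed lets both sides regenerate the dither
grad = [0.05, -1.2, 0.3, 0.0, 2.5, -0.01]
sparse = top_k_sparsify(grad, k=2)
restored = dithered_quantize(sparse, step=0.1, rng=rng)
```

In a real exchange only the surviving indices and their integer levels would be transmitted, which is where the communication saving comes from. How this interacts with the differential privacy guarantee is part of the paper's analysis and is not modeled in this sketch.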
