

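1. School of Intelligent Engineering and Supply Chain Innovation, Beijing Wuzi University, Beijing 101149, China
2. Beijing Key Laboratory of Intelligent Logistics Systems, Beijing 101149, China
3. School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China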
Received: 12 June 2025; Accepted: 21 October 2025; Published: 25 October 2025
SHAO Shu-yu, ZHANG Yang, YAN Wen-jing. Driving Behavior Recognition Based on Multimodal Physiological Feature Fusion[J]. Acta Electronica Sinica, 2025, 53(10): 3540-3550. DOI:10.12263/DZXB.20250506
Traditional driving behavior recognition methods rely on external sensor data, are vulnerable to environmental interference, and struggle to reflect the driver's internal cognitive state. To address these limitations, this paper constructs a deep learning framework for multimodal physiological signals that integrates a Transformer with a convolutional neural network (CNN) to achieve high-precision recognition and interpretability analysis of driving behavior. The study is based on the Multimodal Physiological Data for Behavior recognition (MPDB) dataset, which contains electroencephalography (EEG), electrocardiography (ECG), electromyography (EMG), and galvanic skin response (GSR) signals, and lays out a complete pipeline from signal preprocessing and feature extraction to spatiotemporal fusion. After filtering, artifact correction, feature standardization, and time-frequency transformation, the signals of each modality are synchronously aligned to construct a spatiotemporal feature tensor that gives the different physiological modalities a unified representation. At the architecture level, the CNN branch captures local spatiotemporal patterns and extracts short-term response features, while the Transformer branch uses self-attention to model long-range dependencies in the physiological signals and cross-modal interactions, so that the model combines local sensitivity with global temporal modeling. The fusion network adopts a two-stream structure that combines multi-head attention with multi-scale convolution and introduces a dynamic weight allocation mechanism for adaptive feature fusion. Optimization uses the AdamW algorithm with Dropout regularization to further improve generalization and convergence stability. Experimental results show that the model reaches accuracies of 94.9% and 98.75%, respectively, on the binary classification tasks (smooth driving/dynamic driving). In five-class driving behavior recognition (smooth driving, acceleration, deceleration, lane changing, and turning), the model achieves an average accuracy of 85.39%, significantly higher than that of the recurrent neural network (RNN), long short-term memory (LSTM) network, support vector machine (SVM), single-CNN, and single-Transformer models, and it strikes a good balance between F1 score and recall, confirming its overall advantage in multimodal signal representation and temporal dependency modeling. The training curves also show fast convergence and a low final loss, indicating strong robustness and resistance to overfitting. To improve interpretability, the paper further applies Deep SHapley Additive exPlanations (DeepSHAP) to perform feature attribution analysis on the model's decision process. The analysis shows that high-frequency EEG signals (β and γ waves) and upper-limb EMG signals strongly influence acceleration maneuvers, whereas tibialis anterior muscle activity and reaction delay significantly influence lane-changing maneuvers. The proposed method reveals the physiological response patterns behind different driving maneuvers and offers a new perspective on the neuro-behavioral hierarchy of drivers. In summary, the proposed Transformer-CNN fusion framework effectively extracts spatiotemporal features from multimodal physiological signals and performs well in recognition accuracy, stability, and interpretability. It provides practical technical support for intelligent driving monitoring systems and a technical direction for applying multi-source signal modeling and explainable artificial intelligence in driving safety research. Future work will examine how naturalistic driving conditions affect the proposed method, with a view to broader use in real-time driving state monitoring and continuous risk prediction.
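The abstract mentions a dynamic weight allocation mechanism for fusing the CNN and Transformer branches but does not give its form. As an illustration only, the sketch below assumes one common choice: a softmax gate that scores each branch's feature vector and mixes the two vectors with the resulting weights. The function names (`gated_fusion`, `softmax`, `dot`) and the gate vectors `gate_cnn` and `gate_trf` are hypothetical and are not taken from the paper; in a trained model the gate vectors would be learned parameters.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gated_fusion(
    cnn_feat: Sequence[float],
    trf_feat: Sequence[float],
    gate_cnn: Sequence[float],
    gate_trf: Sequence[float],
) -> Tuple[List[float], Tuple[float, float]]:
    """Fuse two branch feature vectors with input-dependent weights.

    Assumption (not from the paper): each branch gets a scalar score from
    a scoring vector, and a softmax over the two scores yields the weights.
    """
    if len(cnn_feat) != len(trf_feat):
        raise ValueError("branch features must have the same length")
    scores = [dot(gate_cnn, cnn_feat), dot(gate_trf, trf_feat)]
    w_cnn, w_trf = softmax(scores)
    # Weighted sum of the two branches, element by element.
    fused = [w_cnn * c + w_trf * t for c, t in zip(cnn_feat, trf_feat)]
    return fused, (w_cnn, w_trf)


# Usage: with all-zero gate vectors both branches score 0, so the weights
# are equal and the fused vector is the element-wise mean of the branches.
fused, weights = gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0])
print(fused, weights)
```

Because the weights depend on the input features, the balance between local (CNN) and global (Transformer) information can change from sample to sample, which is the sense in which such a gate is "dynamic". The authors' actual mechanism may differ.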