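1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023
2. Institute of High Performance Computing and Big Data Processing, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023
3. Nanjing Branch of the National High Performance Computing Center, Nanjing, Jiangsu 210023
4. Jiangsu Engineering Research Center for High-Performance Computing and Intelligent Processing, Nanjing, Jiangsu 210023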
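LIU Shangdong, male, was born in October 1979 in Yongjing County, Gansu Province. He is an associate professor and master's supervisor at Nanjing University of Posts and Telecommunications. His main research interests include cyberspace security, artificial intelligence, and big data. E-mail: lsd@njupt.edu.cn
YANG Yirun, male, was born in December 2001 in Wuxi, Jiangsu Province. He is a master's student at the School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications. His main research interest is APT detection and attribution. E-mail: 1024041132@njuput.edu.cn
WANG Yinuo, female, was born in November 2003 in Lianyungang, Jiangsu Province. She is a master's student at the School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications. Her main research interest is network security situational awareness. E-mail: 1025040932@njupt.edu.cn
DU Hongyu, male, was born in November 2000 in Yancheng, Jiangsu Province. He is a Ph.D. student at the School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications. His main research interests are vulnerability detection and cyberspace threat awareness. E-mail: 2024040408@njupt.edu.cn
WANG Wenbo, male, was born in June 1998 in Suzhou, Jiangsu Province. He is a Ph.D. candidate at the School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications. His main research interests include computer network security and machine learning. E-mail: wangwenbo@njupt.edu.cn
JI Yimu, male, was born in September 1978 in Wuwei, Anhui Province. He is a professor and doctoral supervisor at the School of Computer Science, Nanjing University of Posts and Telecommunications. His main research interests include artificial intelligence, cloud computing, and big data security. E-mail: iym@njupt.edu.cn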
Received: 2026-03-29
Accepted: 2026-05-05
Published online: 2026-06-15
LIU Shangdong, YANG Yirun, WANG Yinuo, et al. Advanced Persistent Threat Detection Via Hierarchical Masking Graph Autoencoder[J/OL]. ACTA ELECTRONICA SINICA, 2026, 1-17. DOI: 10.12263/DZXB.20251261.
Advanced Persistent Threats (APTs) have become one of the most severe challenges facing modern cybersecurity defenses because of their stealth, long duration, and multi-stage nature. Provenance graph analysis of host logs correlates isolated system events into fine-grained behavioral audit paths and thus gives threat detection a structured basis, but existing research still faces a core bottleneck: in complex system environments, attackers often use low-frequency operations to mimic benign behavior, so traditional detection schemes based on signatures or static rules readily fail against zero-day attacks. To address these challenges, this paper proposes a hierarchy-aware graph masked autoencoder framework for APT detection. Its core innovation is the use of hierarchical topological knowledge to guide the masking process instead of blind random masking. Specifically, the model integrates three strategies: Global-Aware Masking (GAM), Local-Aware Masking (LAM), and Element-Aware Masking (EAM). GAM aims to preserve the macro-structural stability of the provenance graph; LAM characterizes the neighborhood interaction logic between entities; and EAM addresses fine-grained entity attributes. This hierarchical design filters out non-structural system noise during pre-training while retaining as much of the critical causal chains as possible. Notably, the node-level consistency constraint is modeled at the level of individual nodes, which avoids the signal dilution that global aggregation causes in traditional graph representation learning. As a result, even under extremely imbalanced sample distributions, faint attack signals still receive sufficient gradient through the loss function, thereby aligning the training objective mathematically with the point-wise anomaly detection task. In the detection phase, the framework applies an unsupervised anomaly detection algorithm that quantifies node anomaly scores from the embedding distributions of each entity type, so that malicious behaviors that break local causal chains can be identified precisely. The framework is evaluated on several public benchmark datasets, including StreamSpot, Unicorn Wget, and DARPA E3. Experimental results show an average precision of 98.49% and an F1-score of 98.97%. Compared with existing baselines, the method is more robust and achieves higher recall in scenarios with extremely low attack base rates, and it effectively identifies subtle anomalous signals throughout the entire APT lifecycle.
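The detection phase described in the abstract, scoring each node against the embedding distribution of its own entity type, can be illustrated with a small sketch. The abstract does not name the unsupervised algorithm, so the mean distance to the k nearest benign embeddings of the same type is used here purely as an assumption. The function name `knn_anomaly_scores`, the parameter `k`, and the toy 2-D embeddings are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def knn_anomaly_scores(embeddings, entity_types, train_embeddings, train_types, k=2):
    """Score each node by its mean Euclidean distance to the k nearest benign
    training embeddings of the same entity type (higher = more anomalous).

    Assumption: kNN distance stands in for the unnamed unsupervised detector.
    """
    # Group the benign reference embeddings by entity type.
    by_type = defaultdict(list)
    for vec, typ in zip(train_embeddings, train_types):
        by_type[typ].append(vec)

    scores = []
    for vec, typ in zip(embeddings, entity_types):
        reference = by_type.get(typ, [])
        if not reference:
            # No benign node of this type was observed: flag as maximally anomalous.
            scores.append(math.inf)
            continue
        nearest = sorted(math.dist(vec, ref) for ref in reference)[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores


# Made-up benign embeddings: a "process" cluster near the origin
# and a "file" cluster near (5, 5).
benign = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
benign_types = ["process", "process", "process", "file", "file"]

# Three test nodes, all typed as processes.
test = [(0.05, 0.05), (5.0, 5.0), (3.0, 3.0)]
test_types = ["process", "process", "process"]

scores = knn_anomaly_scores(test, test_types, benign, benign_types, k=2)
```

In this toy setup the second test node is typed as a process but lies in the region occupied by file embeddings, so it receives the highest score: it is unremarkable globally yet anomalous for its own entity type, which is the situation per-type scoring is meant to expose.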