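1. School of Computer Science, Beijing Institute of Technology, Beijing 100081, China
2. School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
LIU Tian-yang, female, was born in Baoding, Hebei Province, in July 1995. She is currently a Ph.D. candidate at the Institute of Computer Architecture and High Performance Computing, Beijing Institute of Technology. Her main research interest is program analysis and optimization. E-mail: lty@bit.edu.cn
SHI Jian-jun, female, was born in Dengzhou, Henan Province, in February 1991. She is currently an assistant researcher at the School of Artificial Intelligence, Beijing Normal University. Her main research interests are program analysis and optimization and system software security. E-mail: shijianjun@bnu.edu.cn
YE Jia-wei, male, was born in Linhai, Zhejiang Province, in May 2001. He is currently a master's student at the Institute of Computer Architecture and High Performance Computing, Beijing Institute of Technology. His main research interest is program analysis and optimization. E-mail: yejiawei@bit.edu.cn
JI Wei-xing, male, was born in Xianyang, Shaanxi Province, in February 1980. He is currently a professor at the School of Artificial Intelligence, Beijing Normal University. His main research interests are computer architecture, program analysis and optimization, and parallel and high-performance computing. Member of the Chinese Institute of Electronics, membership No. E190197518M. E-mail: jwx@bnu.edu.cn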
Received: 2025-09-22
Accepted: 2025-11-04
Published in print: 2025-11-25
LIU Tian-yang, SHI Jian-jun, YE Jia-wei, et al. P-Slicer: A Program Slicing Approach Based on Learning Path Representations[J]. Acta Electronica Sinica, 2025, 53(11): 3894-3909. DOI:10.12263/DZXB.20250824
Program slicing is a foundational technique in software analysis and plays an important role in tasks such as program understanding, defect localization, and code refactoring. Its core challenge is to accurately identify the code fragments related to a given slicing criterion within complex control-flow and data-flow structures. Recently, slicing approaches based on pre-trained large language models have shown promising results owing to their strong ability to model program semantics. However, because of model input-length limits, they struggle with practical scenarios such as long method bodies and interprocedural dependencies. To address these problems, this paper proposes P-Slicer, a program slicing approach based on learning path representations. The approach first builds a control flow graph from the syntactic structure and extracts multiple possible execution paths from it, achieving high code coverage while preserving contextual information. It then uses a learning-based classification model to judge whether each statement inside a method is relevant to the slicing criterion. Finally, it combines this with a define-use propagation mechanism for variables to perform interprocedural slicing through recursive analysis. The approach integrates semantic understanding while remaining scalable, which improves the accuracy and practicality of the slicing results. Experimental results show that P-Slicer achieves 95.95% accuracy, 86.89% precision, and 88.95% recall on the slicing task and maintains good performance on long methods and interprocedural slices, indicating promising application prospects in software engineering.
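The path-extraction step described in the abstract can be illustrated with a minimal sketch. It is not P-Slicer's implementation: the toy control flow graph, the function names `enumerate_paths` and `select_covering_paths`, the rule that each edge is taken at most once per path (so a loop body is entered at most once), and the greedy coverage heuristic are all assumptions made here for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical CFG of a small method (not taken from the paper).
# Keys are statement ids, values are the possible successor statements.
CFG: Dict[int, List[int]] = {
    1: [2],      # int x = read();
    2: [3, 4],   # if (x > 0)
    3: [5],      #     y = x * 2;
    4: [5],      # else y = -x;
    5: [6, 7],   # while (y > 10)
    6: [5],      #     y = y - 1;
    7: [],       # return y;
}


def enumerate_paths(cfg: Dict[int, List[int]], entry: int, exit_node: int) -> List[List[int]]:
    """Enumerate entry-to-exit paths, using each CFG edge at most once per path.

    The edge-once bound is an assumption that keeps enumeration finite
    when the graph contains loops.
    """
    paths: List[List[int]] = []

    def dfs(node: int, path: List[int], used: FrozenSet[Tuple[int, int]]) -> None:
        if node == exit_node:
            paths.append(path)
            return
        for succ in cfg[node]:
            edge = (node, succ)
            if edge not in used:
                dfs(succ, path + [succ], used | {edge})

    dfs(entry, [entry], frozenset())
    return paths


def select_covering_paths(paths: List[List[int]], nodes: List[int]) -> List[List[int]]:
    """Greedily pick paths until every statement appears on a chosen path."""
    uncovered = set(nodes)
    chosen: List[List[int]] = []
    while uncovered:
        best = max(paths, key=lambda p: len(uncovered & set(p)))
        gain = uncovered & set(best)
        if not gain:
            break  # the remaining statements lie on no enumerated path
        chosen.append(best)
        uncovered -= gain
    return chosen


all_paths = enumerate_paths(CFG, entry=1, exit_node=7)
covering = select_covering_paths(all_paths, list(CFG))
for p in covering:
    print(p)
```

Each selected path is a statement sequence short enough to be fed to a classifier on its own, which is the motivation the abstract gives for working with paths rather than whole method bodies. In this toy graph, two of the four enumerated paths suffice to cover every statement.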