Ransomware Early Detection Method Based on Short API Sequence

doi:10.12263/DZXB.20200623

Acta Electronica Sinica, 2021, Vol. 49, Issue 3: 586-595. DOI: 10.12263/DZXB.20200623

Research Paper

CHEN Chang-qing, GUO Chun, CUI Yun-he, SHEN Guo-wei, JIANG Chao-hui

Abstract

Traditional dynamic detection methods for ransomware need to collect software behavior over a long period, which makes it difficult to detect ransomware in time. From the perspective of timely detection, this paper proposes the concept of "Critical Time Periods for Ransomware Detection (CTP)" and, based on the requirements of CTP, proposes a Ransomware Early Detection Method based on short API Sequence (REDMS), where API stands for application programming interface. REDMS takes as its analysis object the short sequences of API calls that software makes while executing within the CTP. It processes the collected short API sequences with the n-gram model and the term frequency-inverse document frequency (TF-IDF) algorithm to generate feature vectors, and then uses a machine learning algorithm to build a detection model for early ransomware detection. Experimental results show that, when the API collection period is the first 7 s and the random forest algorithm is used, REDMS detects known and unknown ransomware samples with accuracies of 98.2% and 96.7%, respectively.
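The feature-extraction pipeline described in the abstract (truncate the API trace to a short time window, split it into n-grams, weight the n-grams with TF-IDF) can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`calls_within`, `ngrams`, `tfidf_vectors`), the two toy traces, the choice of n = 2, and the textbook weighting tf × log(N/df) are all hypothetical choices made for illustration. Only the 7 s window comes from the abstract.

```python
import math
from collections import Counter

# Hypothetical toy traces: (seconds since process start, API name).
# The traces are invented for illustration only.
traces = [
    [(0.5, "CreateFileW"), (1.2, "ReadFile"), (2.0, "CryptEncrypt"),
     (3.1, "WriteFile"), (9.0, "DeleteFileW")],
    [(0.4, "CreateFileW"), (1.0, "ReadFile"), (1.5, "CloseHandle"),
     (8.5, "WriteFile")],
]

WINDOW_S = 7.0  # collection window from the abstract (first 7 s)
N = 2           # n-gram length; an assumption, not taken from the paper


def calls_within(trace, window_s):
    """Keep only the API names called within the first window_s seconds."""
    return [name for t, name in trace if t <= window_s]


def ngrams(seq, n):
    """Return all runs of n consecutive API calls as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def tfidf_vectors(docs):
    """Turn lists of n-grams into TF-IDF vectors over a shared vocabulary.

    Uses the plain textbook variant: tf = count / document length,
    idf = log(number of documents / document frequency).
    """
    vocab = sorted({g for doc in docs for g in doc})
    df = Counter(g for doc in docs for g in set(doc))
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty traces
        vectors.append(
            [(counts[g] / total) * math.log(n_docs / df[g]) for g in vocab]
        )
    return vocab, vectors


docs = [ngrams(calls_within(t, WINDOW_S), N) for t in traces]
vocab, vectors = tfidf_vectors(docs)

for gram, weights in zip(vocab, zip(*vectors)):
    print(gram, weights)
```

In this sketch, calls made after the 7 s window are discarded before any features are computed, and an n-gram that occurs in every trace receives weight zero because its inverse document frequency is log(1). The resulting vectors would then be passed to a classifier such as the random forest mentioned in the abstract. That step is omitted here to keep the example free of third-party dependencies.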

Cite this article

CHEN Chang-qing, GUO Chun, CUI Yun-he, SHEN Guo-wei, JIANG Chao-hui. Ransomware Early Detection Method Based on Short API Sequence[J]. Acta Electronica Sinica, 2021, 49(3): 586-595. https://doi.org/10.12263/DZXB.20200623

CLC number: TP309
