Acta Electronica Sinica, 2021, Vol. 49, Issue (3): 586-595. DOI: 10.12263/DZXB.20200623

• Academic Paper •

Ransomware Early Detection Method Based on Short API Sequence

CHEN Chang-qing, GUO Chun, CUI Yun-he, SHEN Guo-wei, JIANG Chao-hui

  1. Guizhou Provincial Key Laboratory of Public Big Data, School of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
  • Received: 2020-06-28 Revised: 2020-12-01 Online: 2021-03-25 Published: 2021-03-25
  • Corresponding author: GUO Chun
  • About the author: CHEN Chang-qing, male, born in 1997, is a master's student and a CCF student member. His main research interests are computer networks and information security. E-mail: ccq_study@163.com
  • Supported by:
    National Natural Science Foundation of China (No.61802081); Science and Technology Foundation of Guizhou Province (Qiankehe Jichu [2017]1051, Qiankehe Major Special Project [2018]3001); Open Project of Guizhou Provincial Key Laboratory of Public Big Data (No.2017BDKFJJ025)

Abstract: Traditional dynamic ransomware detection methods need to collect software behavior over a long period, which makes it difficult to detect ransomware in time. From the perspective of timely detection, this paper proposes the concept of "Critical Time Periods for Ransomware Detection (CTP)" and, based on the requirements of CTP, proposes a Ransomware Early Detection Method based on short API Sequence (REDMS), where API stands for application programming interface. REDMS takes as its analysis object the short API sequences that software calls while executing within the CTP. It processes the collected short API sequences with the n-gram model and the term frequency-inverse document frequency (TF-IDF) algorithm to generate feature vectors, and then uses a machine learning algorithm to build a detection model for the early detection of ransomware. The experimental results show that, when the API collection period is the first 7 s and the random forest algorithm is used, REDMS detects known and unknown ransomware samples with accuracies of 98.2% and 96.7%, respectively.
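The feature-generation step described in the abstract (n-grams over a short API sequence, weighted by TF-IDF) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ngrams` and `tfidf_vectors`, the sample traces, the choice of n = 2, and the exact weighting (tf as relative frequency, idf as log(N/df)) are all assumptions made for illustration. The classification step is omitted; the resulting vectors could be passed to any classifier, such as a random forest.

```python
import math
from collections import Counter


def ngrams(api_calls, n):
    """Return the contiguous n-grams (as tuples) of one API call sequence."""
    return [tuple(api_calls[i:i + n]) for i in range(len(api_calls) - n + 1)]


def tfidf_vectors(sequences, n=2):
    """Map each API call sequence to a TF-IDF vector over the n-gram vocabulary.

    Assumed weighting (not taken from the paper):
    tf = count / total n-grams in the sequence, idf = log(N / df).
    """
    counts = [Counter(ngrams(seq, n)) for seq in sequences]
    vocab = sorted(set().union(*counts))
    n_docs = len(sequences)
    # Document frequency: number of sequences that contain each n-gram.
    df = {g: sum(1 for c in counts if g in c) for g in vocab}
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against sequences shorter than n
        vectors.append([(c[g] / total) * math.log(n_docs / df[g]) for g in vocab])
    return vocab, vectors


# Hypothetical short API traces, standing in for calls recorded early in execution.
traces = [
    ["CreateFileW", "ReadFile", "CryptEncrypt", "WriteFile"],
    ["CreateFileW", "ReadFile", "CloseHandle"],
]
vocab, vectors = tfidf_vectors(traces, n=2)
```

Under this weighting, an n-gram that appears in every trace gets a weight of zero, so only the n-grams that distinguish one trace from another contribute to the feature vector.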

Key words: ransomware, early detection, machine learning, application programming interface (API)

CLC Number: