Acta Electronica Sinica ›› 2022, Vol. 50 ›› Issue (1): 250-256. DOI: 10.12263/DZXB.20200619

• Research Communications •

Novel Botnet DGA Domain Detection Method Based on Character Level Sliding Window and Deep Residual Network

LIU Xiao-yang1, LIU Jia-miao1, LIU Chao1, ZHANG Yi-hao2

  1. School of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
  2. School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, China
  • Received: 2020-06-28 Revised: 2021-02-20 Online: 2022-01-25 Published: 2022-01-25
  • About the authors: LIU Xiao-yang, male, born in 1980 in Anqing, Anhui. Postdoctoral fellow. He is a professor and master's supervisor at the School of Computer Science and Engineering, Chongqing University of Technology. His research covers social network analysis, artificial intelligence, network security, and data mining. E-mail: lxy3103@163.com
    LIU Jia-miao (corresponding author), male, born in 1994 in Yubei, Chongqing. He is a master's student at the School of Computer Science and Engineering, Chongqing University of Technology. His research covers network security, malicious traffic detection, and domain name analysis. E-mail: jiamiaoliu@126.com
  • Supported by:
    National Social Science Fund of China (17XXW004)

Abstract:

This paper proposes SW-DRN (Sliding Window-Depth Residual Network), a deep residual network built on a character-level sliding window, and is the first to apply lightweight depthwise separable convolution to the detection of DGA (Domain Generation Algorithm) domain names in botnets. Using depthwise separable convolution in SW-DRN reduces the number of model parameters by about 56% compared with standard convolution, which improves detection efficiency. Data were collected from two different sources and named Real-Dataset and Gen-Dataset, and SW-DRN was compared with baseline DGA detection models from previous work on both datasets. In the binary DGA domain classification task, SW-DRN achieves F-Scores of 99.23% and 97.81% on the two datasets, respectively. In the multi-class task, it also performs well on DGA families with few samples and on DGA domains whose strings are easily confused: compared with existing DGA domain classification models, it improves the overall F-Score by 1.23% and 1.01%, improving discrimination between DGA families. The model was also tested on domain names produced by generative adversarial models, and it identified them effectively.
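As a rough illustration of why depthwise separable convolution is lighter than standard convolution, the sketch below counts the weights of a single 1D convolution layer in both forms. The layer sizes (`c_in`, `c_out`, `k`) are hypothetical and bias terms are ignored; the abstract's figure of about 56% refers to the whole SW-DRN model, whose architecture is not given here, so this per-layer number is not expected to match it.

```python
def standard_conv1d_params(c_in, c_out, kernel_size):
    """Standard 1D convolution: one k-wide filter per (input, output) channel pair."""
    return kernel_size * c_in * c_out


def separable_conv1d_params(c_in, c_out, kernel_size):
    """Depthwise separable 1D convolution, bias terms ignored."""
    depthwise = kernel_size * c_in   # one k-wide filter per input channel
    pointwise = c_in * c_out         # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration (not taken from the paper).
c_in, c_out, k = 128, 128, 3
std = standard_conv1d_params(c_in, c_out, k)
sep = separable_conv1d_params(c_in, c_out, k)
print(std, sep, f"{1 - sep / std:.1%} fewer weights in this layer")
```

For a single layer the ratio of separable to standard weights is 1/c_out + 1/k, so the saving grows with the kernel size and the number of output channels.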

Key words: domain generation algorithm, character-level vector, residual network, depthwise separable convolution
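The character-level sliding window named in the abstract can be pictured with a minimal sketch: each character of a domain name is mapped to an integer index, and a fixed-size window slides over the sequence to produce overlapping character groups. The vocabulary, window size, stride, and the helper names `encode` and `sliding_windows` are all assumptions made for illustration; the paper's actual encoding may differ.

```python
import string

# Hypothetical vocabulary: lowercase letters, digits, hyphen, and dot.
# Index 0 is reserved for padding and unknown characters.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.ascii_lowercase + string.digits + "-.")}


def encode(domain):
    """Map each character of a domain name to its integer index."""
    return [VOCAB.get(ch, 0) for ch in domain.lower()]


def sliding_windows(seq, size=3, stride=1):
    """Return overlapping windows of `size` indices, padding short inputs with 0."""
    if len(seq) < size:
        seq = seq + [0] * (size - len(seq))
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, stride)]


windows = sliding_windows(encode("example.com"), size=3)
print(len(windows), windows[0])
```

Each window could then be embedded and passed to the convolutional layers; that step is omitted here.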

CLC Number: