Malware Classification Method Based on Improved CNN

XUAN Bo-na; LI Jin

doi:10.12263/DZXB.20220818


PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 51, Issue 5, Pages: 1187-1197 (2023)
- Affiliation: Air and Missile Defense College, Air Force Engineering University, Xi'an, Shaanxi 710051, China
- Funding: National Natural Science Foundation of China (61806219; 61703426; 61876189)
- DOI: 10.12263/DZXB.20220818
- CLC: TP309.5
- Received: 14 July 2022; Revised: 2 November 2022; Published: 25 May 2023
XUAN Bo-na, LI Jin. Malware Classification Method Based on Improved CNN[J]. ACTA ELECTRONICA SINICA, 2023, 51(05): 1187-1197. DOI: 10.12263/DZXB.20220818.

Abstract

The growing number of malware variants poses a serious threat to network security and exposes the weak generalization and insufficient accuracy of existing malware classification methods based on convolutional neural networks (CNNs). To address these problems, this paper proposes a classification method based on an improved CNN and RGB (Red Green Blue) visualization of malware, which is resistant to malware variants and obfuscated malware. First, a feature representation method based on RGB images is proposed. It pays closer attention to the semantic relationships among the binary, assembly, and API information of malware, and it generates images with richer texture information that can uncover deeper dependencies between original malware and its variants. Second, to address malware encryption and obfuscation, a coordinate attention module (CAM) is used to capture spatial information over a larger range and thereby strengthen the malware features. Finally, atrous spatial pyramid pooling (ASPP) is incorporated to improve the CNN model and to mitigate the information loss and redundancy caused by image size normalization. Experimental results show that the proposed method stands out among recent advanced methods, with accuracies of 99.48% on the Kaggle dataset and 97.78% on the DataCon dataset. Compared with other methods, accuracy improves by 0.22% on the Kaggle dataset and by 0.80% on the DataCon dataset. The method effectively classifies malware and variants of malware families, and it shows good generalization and anti-obfuscation ability.
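
The ASPP step mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `atrous_conv2d` and `aspp`, the fixed shared kernel, and the dilation rates are all assumptions made for illustration, and a real ASPP layer would learn a separate kernel for each branch. The sketch only shows the core idea that parallel atrous (dilated) convolutions sample the same image at several spatial ranges while keeping the output size unchanged.

```python
from typing import List, Sequence

Grid = List[List[float]]


def atrous_conv2d(image: Grid, kernel: Grid, rate: int) -> Grid:
    """'Same'-size convolution whose kernel taps are spaced `rate` pixels apart."""
    height, width = len(image), len(image[0])
    k = len(kernel) // 2  # kernel radius (1 for a 3x3 kernel)
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for i in range(-k, k + 1):
                for j in range(-k, k + 1):
                    yy, xx = y + i * rate, x + j * rate
                    # Zero padding: taps that fall outside the image add nothing.
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += kernel[i + k][j + k] * image[yy][xx]
            out[y][x] = acc
    return out


def aspp(image: Grid, kernel: Grid, rates: Sequence[int] = (1, 2, 3)) -> List[Grid]:
    """Stack one atrous branch per rate, plus an image-level average channel.

    Assumption: every branch reuses the same hypothetical kernel; a trained
    ASPP layer would hold separate learned weights for each branch.
    """
    height, width = len(image), len(image[0])
    branches = [atrous_conv2d(image, kernel, r) for r in rates]
    mean = sum(sum(row) for row in image) / (height * width)
    # Image-level branch: global average pooling broadcast back to full size.
    branches.append([[mean] * width for _ in range(height)])
    return branches
```

With a 5x5 image containing a single nonzero pixel at the center and an all-ones 3x3 kernel, the rate-1 branch responds in the 3x3 neighbourhood of that pixel, while the rate-2 branch responds at positions two pixels away, including the image corners. Every branch keeps the 5x5 size, so the branches can be stacked as channels.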


