Research on the Generation of Adversarial Samples Based on the Attribution of Principal Features

WANG Shuo; XU Ru-zhi; GUAN Zhi-tao

doi:10.12263/DZXB.20230383


PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 51, Issue 11, Pages: 3137-3145 (2023)
- Author affiliation: School of Control and Computer Engineering, North China Electric Power University, Beijing 102206
- Funding: National Natural Science Foundation of China (61972148)
- CLC: TP391.4
- Received: 28 April 2023; Revised: 21 September 2023; Published: 25 November 2023
WANG Shuo, XU Ru-zhi, GUAN Zhi-tao. Research on the Generation of Adversarial Samples Based on the Attribution of Principal Features [J]. ACTA ELECTRONICA SINICA (电子学报), 2023, 51(11): 3137-3145. DOI: 10.12263/DZXB.20230383.

Abstract

To improve the quality of the samples used for adversarial training, this paper examines the feature recognition process inside deep learning models and proposes a method for generating transferable adversarial samples based on principal feature attribution. After extracting the principal features of a sample, the algorithm performs feature attribution on the neurons of a target layer and uses an independence assumption to simplify the gradient computation. By suppressing the recognition role of active (positively contributing) neurons, it obtains more transferable adversarial samples more efficiently. In extensive experiments on attacks against multiple models, the proposed algorithm raises the attack success rate by at least 5% over existing methods, which lays a foundation for further research on improving model robustness.
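The pipeline described in the abstract (extract principal features, attribute target-layer neurons, suppress the positively contributing ones) can be illustrated with a toy sketch. This page does not give the paper's formulas, so everything below is an assumption made for illustration: a truncated SVD stands in for principal feature extraction, the "model" is a single ReLU layer `w1` with a linear head `w2`, the independence assumption is read as gradient-times-activation attribution per neuron, and the function names are hypothetical. It is not the authors' implementation.

```python
import numpy as np

def principal_component_image(x, k):
    """Keep the top-k singular components of a 2-D image.
    Assumed stand-in for the paper's principal feature extraction."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k]

def neuron_attribution(x_main, w1, w2):
    """Gradient-times-activation attribution of each target-layer neuron
    to the true-class logit, treating neurons independently (assumption)."""
    y = np.maximum(w1 @ x_main.ravel(), 0.0)  # ReLU activations of the target layer
    return y * w2                             # d(logit)/dy_j = w2[j] for a linear head

def attack(x, w1, w2, k=4, eps=0.1, steps=10):
    """Iterative sign-gradient attack that lowers the activations of
    positively attributed neurons inside an L-infinity ball of radius eps."""
    attr = neuron_attribution(principal_component_image(x, k), w1, w2)
    weights = np.where(attr > 0, attr, 0.0)   # keep only positive (active) neurons
    alpha = eps / steps
    x_adv = x.copy()
    for _ in range(steps):
        pre = w1 @ x_adv.ravel()
        # Gradient of sum_j weights[j] * relu(pre[j]) with respect to the input.
        grad = ((weights * (pre > 0)) @ w1).reshape(x.shape)
        x_adv = x_adv - alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within the perturbation budget
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay a valid image
    return x_adv

# Toy usage with random weights (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(8, 8))
w1 = rng.normal(size=(16, 64))
w2 = rng.normal(size=16)
x_adv = attack(x, w1, w2)
print(float(np.max(np.abs(x_adv - x))))  # perturbation size, bounded by eps
```

In a real setting the attribution and gradients would come from an intermediate layer of a trained network via automatic differentiation, and transferability would be measured by feeding `x_adv` to other models; the sketch only shows the shape of the computation.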

