A Multi-Layer Feature Importance Attack Method Based on Iterative Accumulated Gradients

WU Ji; SHAO Wen-ze; GE Qi; SUN Yu-bao

doi:10.12263/DZXB.20230843


Research article | Updated: 2025-12-11

- Chinese title: 一种基于迭代累积梯度的多层特征重要性攻击方法
- English title: A Multi-Layer Feature Importance Attack Method Based on Iterative Accumulated Gradients
- Journal: Acta Electronica Sinica, 2024, Vol. 52, No. 11, pp. 3798-3808
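- Author affiliations:
  
  1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003
  2. Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044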
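- Author biographies:
  
  WU Ji, male, was born in Tangshan, Hebei Province, in April 1999. He is currently a master's student at Nanjing University of Posts and Telecommunications. His main research interests include neural network interpretability and adversarial attacks. E-mail: wj233enter@163.com
  SHAO Wen-ze, male, was born in Lianyungang, Jiangsu Province, in October 1981. He is currently a professor at the School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications. His main research interests include inverse problems in computational imaging, neuromorphic computing, and trustworthy artificial intelligence. E-mail: shaowenze@njupt.edu.cn
  GE Qi, female, was born in Nantong, Jiangsu Province, in January 1984. She is currently an associate professor at the School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications. Her main research interests include computer vision, tensor decomposition, and deep learning. E-mail: geq@njupt.edu.cn
  SUN Yu-bao, male, was born in Lianyungang, Jiangsu Province, in May 1983. He is currently a professor at the School of Computer Science, Nanjing University of Information Science and Technology. His main research interests include deep learning, pattern recognition, and hyperspectral remote sensing image processing and analysis. E-mail: sunyb@nuist.edu.cn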
- Funding:
  
  National Natural Science Foundation of China (61771250; 61972213)
- DOI: 10.12263/DZXB.20230843
  Chinese Library Classification number: TP183
- Received: 2023-09-07,
  
  Revised: 2024-02-03,
  
  Published in print: 2024-11-25

WU Ji, SHAO Wen-ze, GE Qi, et al. A Multi-Layer Feature Importance Attack Method Based on Iterative Accumulated Gradients[J]. Acta Electronica Sinica, 2024, 52(11): 3798-3808. DOI: 10.12263/DZXB.20230843

Abstract

The transferability of adversarial samples is crucial for attacking unknown models, and it is what makes adversarial attacks feasible in practical scenarios. Existing transfer attacks tend to degrade the prediction accuracy of the source model by distorting features indiscriminately, overlooking the intrinsic features of the objects in an image. Inspired by existing work on extracting feature importance, this paper proposes a multi-layer accumulated gradient attack that disrupts the important object-aware features dominating the model's decision. Specifically, iterative accumulated gradients are introduced to obtain feature importance; these gradients are highly correlated with the main object region and therefore help achieve better transfer attacks. Furthermore, attacks on different intermediate layers are combined, yielding the final multi-layer accumulated gradient attack. Extensive experimental results show that, compared with the best-performing method among the baselines, the proposed method achieves a comparable attack success rate on normally trained models with higher attack efficiency, and raises the attack success rate on defense models by 2.6 percentage points.
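The abstract does not give the update rules, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a toy three-layer ReLU network in NumPy, takes the running sum of the true-class logit's gradients with respect to each intermediate feature as that layer's importance, and lowers the importance-weighted feature sum of both layers with signed steps inside an L-infinity ball. All names (`forward`, `feature_grads`, `multilayer_attack`), the loss form, and the step size are hypothetical choices made for this example.

```python
import numpy as np

def forward(x, W1, W2, W3):
    """Toy 3-layer ReLU network: returns two intermediate features and the logits."""
    f1 = np.maximum(W1 @ x, 0.0)
    f2 = np.maximum(W2 @ f1, 0.0)
    return f1, f2, W3 @ f2

def feature_grads(x, y, W1, W2, W3):
    """Gradient of the true-class logit z_y w.r.t. each intermediate feature."""
    _, f2, _ = forward(x, W1, W2, W3)
    g2 = W3[y].copy()              # d z_y / d f2
    g1 = W2.T @ (g2 * (f2 > 0))    # d z_y / d f1 (back through the second ReLU)
    return g1, g2

def multilayer_attack(x, y, W1, W2, W3, eps=0.1, steps=10):
    """Assumed attack loop: accumulate gradients as importance, attack both layers."""
    alpha = eps / steps            # assumed step size
    x_adv = x.copy()
    acc1 = np.zeros(W1.shape[0])   # accumulated importance for layer 1
    acc2 = np.zeros(W2.shape[0])   # accumulated importance for layer 2
    for _ in range(steps):
        g1, g2 = feature_grads(x_adv, y, W1, W2, W3)
        acc1 += g1
        acc2 += g2
        f1, f2, _ = forward(x_adv, W1, W2, W3)
        # Assumed loss: sum(acc1 * f1) + sum(acc2 * f2), importance held constant.
        d_f1 = acc1 + W2.T @ (acc2 * (f2 > 0))
        grad_x = W1.T @ (d_f1 * (f1 > 0))
        # Signed descent step, then projection onto the eps-ball around x.
        x_adv = x_adv - alpha * np.sign(grad_x)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

# Toy usage with random weights (purely illustrative data).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 5))
W2 = rng.normal(size=(6, 8))
W3 = rng.normal(size=(3, 6))
x = rng.normal(size=5)
y = 0
eps = 0.1
x_adv = multilayer_attack(x, y, W1, W2, W3, eps=eps)
```

In a real setting the hand-written gradients would be replaced by automatic differentiation on a pretrained source model, and the choice of layers and their weighting would follow the paper's own definitions rather than the equal sum assumed here.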


