Research on Category Similarity-Driven Dynamic Fake Target Adversarial Attack Methods

YU Hong-xia; LU Lei-ji; BAO Lei; CHEN Jun; ZHANG Lin-jun

doi:10.12263/DZXB.20250578

PAPERS | Updated: 2025-12-27

- ACTA ELECTRONICA SINICA Vol. 53, Issue 8, Pages: 2854-2863(2025)
- Author affiliation:
  
  Department of Information Engineering, 中国人民解放军陆军兵种大学, Hefei, Anhui 230031
- Funding:
  
  National Natural Science Foundation of China (62076252)
- DOI: 10.12263/DZXB.20250578
  CLC: TP391
- Received: 02 July 2025, Accepted: 20 August 2025, Published: 25 August 2025

YU Hong-xia, LU Lei-ji, BAO Lei, et al. Research on Category Similarity-Driven Dynamic Fake Target Adversarial Attack Methods[J]. Acta Electronica Sinica, 2025, 53(08): 2854-2863. DOI: 10.12263/DZXB.20250578

Abstract

This paper addresses the limited transferability of adversarial examples in black-box attacks on deep neural networks and proposes a dynamic fake-target (pseudo-target) adversarial attack framework based on semantic correlations between classes. Existing methods often overlook inter-class semantic relationships, so their adversarial perturbations overfit to model-specific features, which severely restricts cross-model transferability. Studies have indicated that, during transfer, adversarial examples are more likely to be misclassified into semantically similar classes than into arbitrary ones, which identifies class similarity as a key factor in transferability. Building on the shared adversarial subspace of similar classes in the model's feature space, we propose a class-similarity-driven dynamic pseudo-target adversarial attack method. First, we construct a dynamic pseudo-target selection strategy: in each perturbation iteration, the class with the highest predicted probability among all incorrect classes, according to the current model's confidence distribution on the adversarial example, is selected as the “pseudo-target”. This target is not fixed; it is adjusted adaptively as the iterations proceed, so that the perturbation direction keeps pointing toward the semantic region with the greatest transfer potential. Second, we introduce a dual-gradient collaborative update framework that fuses the adversarial loss gradient of the true class with the misleading gradient of the pseudo-target class through linear weighting. Through the superposition of the two gradient fields, the perturbation update not only escapes the decision boundary of the source model but also moves toward a semantic subspace shared by multiple models, which significantly improves the cross-model transferability of adversarial examples. The method is also broadly compatible and extensible: it serves as a general optimization mechanism that can be combined with a variety of mainstream gradient-based attacks by adding a dynamic pseudo-target gradient term to each gradient update, without disrupting the gradient structure of the underlying method. Experiments show that the method achieves superior transfer robustness in cross-architecture (Convolutional Neural Network, CNN, and Transformer) and cross-scale (lightweight model) attack scenarios. It can also be combined with diverse gradient attack strategies and data augmentation techniques, and it outperforms existing methods in single, combined, and ensemble attack settings. This study provides a general optimization paradigm for adversarial attacks based on semantic similarity and offers a new approach to improving the transferability of black-box attacks.
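The two steps described in the abstract (dynamic pseudo-target selection and the dual-gradient update) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy linear softmax classifier, the function names, the weight `lam`, the momentum factor `mu`, and the budget values `eps` and `steps` are all assumptions made for illustration, and the abstract does not specify the exact form of the linear weighting. The momentum term follows the usual MI-FGSM-style accumulation only to show how the pseudo-target gradient term might be added to an existing gradient attack.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def ce_grad(W, p, label):
    """Gradient w.r.t. x of cross-entropy(softmax(W @ x + b), label)."""
    onehot = np.zeros_like(p)
    onehot[label] = 1.0
    return W.T @ (p - onehot)

def pick_pseudo_target(p, true_label):
    """Most probable class among all classes except the true one."""
    masked = p.copy()
    masked[true_label] = -np.inf
    return int(np.argmax(masked))

def dynamic_pseudo_target_attack(W, b, x, true_label,
                                 eps=0.1, steps=10, lam=0.5, mu=0.0):
    """Iterative sign-gradient attack with a dynamic pseudo-target term.

    lam (assumed) weights the pseudo-target gradient; mu (assumed) is an
    optional momentum factor, with mu=0 giving a plain iterative attack.
    """
    alpha = eps / steps                  # per-step size
    x_adv = x.copy()
    momentum = np.zeros_like(x)
    targets = []
    for _ in range(steps):
        p = softmax(W @ x_adv + b)
        t = pick_pseudo_target(p, true_label)   # re-selected every iteration
        targets.append(t)
        g_true = ce_grad(W, p, true_label)      # raises the true-class loss
        g_fake = -ce_grad(W, p, t)              # lowers the pseudo-target loss
        g = g_true + lam * g_fake               # linear weighted fusion
        momentum = mu * momentum + g / (np.abs(g).sum() + 1e-12)
        x_adv = x_adv + alpha * np.sign(momentum)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay in the L-inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the valid input range
    return x_adv, targets

# Toy usage: random linear classifier with 5 classes and 8 input features.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 8))
b = np.zeros(5)
x = rng.uniform(0.0, 1.0, size=8)
y = int(np.argmax(W @ x + b))            # treat the clean prediction as the true label
x_adv, targets = dynamic_pseudo_target_attack(W, b, x, y, eps=0.1, steps=10,
                                              lam=0.5, mu=1.0)
```

In a real setting the analytic gradient of the linear model would be replaced by backpropagation through the source network, and transferability would be measured by feeding `x_adv` to separate target models; neither is shown here.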
