

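Department of Information Engineering, Army Arms University of the Chinese People's Liberation Army (中国人民解放军陆军兵种大学), Hefei, Anhui 230031, China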
Received: 02 July 2025
Accepted: 20 August 2025
Published: 25 August 2025
YU Hong-xia, LU Lei-ji, BAO Lei, et al. Research on Category Similarity-Driven Dynamic Fake Target Adversarial Attack Methods[J]. Acta Electronica Sinica, 2025, 53(08): 2854-2863. DOI: 10.12263/DZXB.20250578 (in Chinese)
This paper addresses the limited transferability of adversarial examples in black-box attacks on deep neural networks and proposes a dynamic pseudo-target (fake target) adversarial attack framework based on semantic correlations between classes. Existing methods often overlook inter-class semantic relationships, so adversarial perturbations tend to overfit to model-specific features, which severely restricts cross-model transferability. Studies indicate that, during transfer, adversarial examples are more likely to be misclassified into semantically similar classes than into arbitrary ones, which points to class similarity as a key factor in transferability. Building on this observation, we exploit the shared adversarial subspace of similar classes in the model's feature space and propose a class-similarity-driven dynamic pseudo-target adversarial attack method.
First, we construct a dynamic pseudo-target selection strategy. In each perturbation iteration, based on the current model's predicted confidence distribution for the adversarial example, we select the class with the highest predicted probability among all incorrect classes as the "pseudo-target". This target is not fixed; it is adjusted adaptively as the iterations proceed, so that the perturbation direction keeps pointing toward the semantic region with the greatest transfer potential.
Second, we propose a dual-gradient collaborative update framework. It fuses the gradient of the adversarial loss for the true class with the misleading gradient of the pseudo-target class by linear weighting. Through the superposition of the two gradient fields, the perturbation update both escapes the decision boundary of the source model and moves toward the semantic subspace shared by multiple models, which significantly improves the cross-model transferability of the adversarial examples.
The method is also broadly compatible and extensible: it can serve as a general optimization mechanism that combines with a variety of mainstream gradient-based attack strategies. In each gradient update, it adds a dynamic pseudo-target gradient term that strengthens cross-model transfer without disrupting the gradient structure of the base method. Experiments show that the method achieves superior transfer robustness in cross-architecture (Convolutional Neural Network (CNN) and Transformer) and cross-scale (lightweight model) attack scenarios. It can also be combined with multiple gradient attack strategies and data augmentation techniques, and it outperforms existing methods in single, combined, and ensemble attack settings. This study offers a general optimization paradigm for adversarial attacks based on semantic similarity and a new approach to improving the transferability of black-box attacks.
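The two steps described in the abstract, dynamic pseudo-target selection and the linearly weighted dual-gradient update, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the exact loss or weighting, so the fused direction `grad(L_true) - lam * grad(L_pseudo)`, the weight `lam`, the toy linear softmax classifier, and all function names are hypothetical. The optional `mu` term imitates a momentum-style base attack only to show where the pseudo-target term would plug in. Clipping to a valid pixel range is omitted for brevity.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, b, x):
    """Class probabilities of a toy linear softmax classifier (assumed source model)."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + bk
                    for row, bk in zip(W, b)])

def ce_grad(W, p, c):
    """Gradient of the cross-entropy loss -log p[c] with respect to the input x."""
    return [sum((p[k] - (1.0 if k == c else 0.0)) * W[k][j] for k in range(len(W)))
            for j in range(len(W[0]))]

def pseudo_target(p, y):
    """Step 1: the most probable class other than the true label y."""
    return max((k for k in range(len(p)) if k != y), key=lambda k: p[k])

def sign(v):
    return (v > 0) - (v < 0)

def dynamic_pseudo_target_attack(W, b, x, y, eps=0.5, steps=10, lam=0.5, mu=0.0):
    """Iterative sign-gradient attack with a per-step pseudo-target term."""
    alpha = eps / steps
    x_adv, momentum, targets = list(x), [0.0] * len(x), []
    for _ in range(steps):
        p = predict(W, b, x_adv)
        t = pseudo_target(p, y)          # re-selected in every iteration
        targets.append(t)
        g_true = ce_grad(W, p, y)        # ascend: move away from the true class
        g_fake = ce_grad(W, p, t)        # descend: move toward the pseudo-target
        # Step 2 (assumed form): linear fusion of the two gradients.
        g = [gt - lam * gf for gt, gf in zip(g_true, g_fake)]
        norm = sum(abs(v) for v in g) or 1.0
        momentum = [mu * m + v / norm for m, v in zip(momentum, g)]
        # Signed step, then projection onto the L-infinity ball around x.
        x_adv = [min(max(xa + alpha * sign(m), xi - eps), xi + eps)
                 for xa, m, xi in zip(x_adv, momentum, x)]
    return x_adv, targets

# Toy example: class 1 has weights close to class 0, class 2 points the other way.
W = [[1.0, 0.0], [0.6, 0.8], [-1.0, 0.0]]
b = [0.0, 0.0, 0.0]
x, y = [1.0, 0.2], 0
x_adv, targets = dynamic_pseudo_target_attack(W, b, x, y)
```

With `mu=0.0` the loop reduces to a plain iterative sign-gradient attack plus the pseudo-target term; a nonzero `mu` accumulates the normalized fused gradient across steps. In a real setting the analytic gradient would be replaced by automatic differentiation on the source network.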