A Guided Diffusion-based Approach to Natural Adversarial Patch Generation

HE Kun, SHE Ji-si, ZHANG Zi-jun, CHEN Jing, WANG Xin-xin, DU Rui-ying

Acta Electronica Sinica, 2024, 52(2): 564-573. DOI: 10.12263/DZXB.20230481
Research Article



Abstract

Adversarial patch attacks in the physical world have gained considerable attention in recent years due to their implications for the security of deep learning models. Existing work has mostly focused on generating adversarial patches that attack well in the physical world, without considering how far the patch patterns depart from natural images, so the resulting patterns are often unnatural and easy for an observer to spot. To tackle this problem, we propose a guided diffusion-based approach to natural adversarial patch generation. Specifically, we construct a predictor of the attack success rate (ASR) by parsing the output of the target detector, and use the gradient of this predictor to guide the reverse process of a pre-trained diffusion model, generating adversarial patches with improved naturalness while maintaining a high ASR. We conduct extensive experiments in both the digital and the physical world to evaluate the attack effectiveness against various object detection models as well as the naturalness of the generated patches. The results show that, by combining the ASR predictor with a pre-trained diffusion model, our method produces more natural adversarial patches than state-of-the-art approaches while remaining highly effective.
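As a rough illustration of the guidance mechanism the abstract describes (shifting each reverse-diffusion step by the gradient of an attack-success-rate predictor, in the style of classifier guidance), the following NumPy sketch runs a toy guided reverse chain. It is not the paper's implementation: the noise schedule, the "denoised" mean, and the `asr_predictor_grad` function are invented placeholders, and a real system would backpropagate through a learned denoiser and a detector-derived ASR predictor.

```python
import numpy as np

def asr_predictor_grad(x):
    """Toy stand-in for grad_x log p(attack succeeds | patch x).
    Here we simply pretend patches near 0.5 attack well, so the
    gradient pulls x toward 0.5 (gradient of -0.5*||x - 0.5||^2)."""
    return 0.5 - x

def guided_reverse_step(x_t, t, betas, guidance_scale=1.0, rng=None):
    """One guided reverse step x_t -> x_{t-1}: the predicted mean is
    shifted by  guidance_scale * Sigma_t * grad log p(y | x_t)."""
    rng = rng or np.random.default_rng(0)
    beta_t = betas[t]
    mu = x_t * np.sqrt(1.0 - beta_t)          # toy "denoised" mean (no learned model)
    sigma2 = beta_t                           # fixed per-step variance
    mu = mu + guidance_scale * sigma2 * asr_predictor_grad(x_t)
    noise = rng.standard_normal(x_t.shape) if t > 0 else 0.0
    return mu + np.sqrt(sigma2) * noise

# Run the full toy reverse chain from pure noise.
T = 50
betas = np.linspace(1e-4, 0.05, T)
x = np.random.default_rng(42).standard_normal((8, 8))
for t in reversed(range(T)):
    x = guided_reverse_step(x, t, betas, guidance_scale=5.0)
```

The key design point, mirrored from the guided-diffusion literature, is that the generative prior (the diffusion model) is left untouched; only the per-step mean is nudged, which is why the samples stay close to the natural image manifold while the ASR objective is optimized.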


Key words

object detection / adversarial patch / diffusion model / adversarial example / adversarial attack / deep learning

Cite this article

HE Kun , SHE Ji-si , ZHANG Zi-jun , CHEN Jing , WANG Xin-xin , DU Rui-ying. A Guided Diffusion-based Approach to Natural Adversarial Patch Generation[J]. Acta Electronica Sinica, 2024, 52(2): 564-573. https://doi.org/10.12263/DZXB.20230481


Funding

National Key Research and Development Program of China (2022YFB3102100)
Fundamental Research Funds for the Central Universities (2042022kf1034)
National Natural Science Foundation of China (62206203)
Key Research and Development Program of Hubei Province (2022BAA039)
Key Research and Development Program of Shandong Province (2022CXPT055)