Funding: Central Government Guided Local Funds for Science and Technology Development (216Z0301G); National Natural Science Foundation of China (61379065); Natural Science Foundation of Hebei Province (F2019203285)
To defend deep neural network (DNN) models against misclassification caused by adversarial examples, an inverse perturbation fusion generative adversarial network (IP-GAN) is proposed. The method makes full use of the adversarial perturbation information contained in adversarial examples, takes inverse perturbation as the starting point of the defense, and analyzes its effectiveness in the high-dimensional feature space. Drawing on the idea of the generative adversarial network, the generator architecture serves as the construction model: it generates an inverse perturbation from each adversarial example, and combining the two yields a reconstructed example. The DNN model is then introduced to guide the direction in which the inverse perturbation is optimized, and the reconstructed examples are fed into the DNN model to obtain the correct classification results. Experimental results show that the constructed inverse perturbation effectively eliminates adversarial perturbations and helps the DNN model classify adversarial examples correctly. Compared with state-of-the-art defense methods, IP-GAN increases the defense success rate on the MNIST and ImageNet datasets by 0.86% and 2.96%, respectively.
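The reconstruction step described above can be illustrated with a very small sketch. It is not the IP-GAN architecture: the linear `generator`, the fixed two-class linear classifier `W`, the toy data, and the hand-derived gradient are all assumptions made for illustration, and there is no discriminator. It only shows the core idea that a generator maps an adversarial example to an inverse perturbation, the two are added to form a reconstructed example, and the classifier's loss guides how the generator is optimized.

```python
import math

# Fixed toy classifier standing in for the protected DNN (assumption):
# it predicts class 0 when x[0] > x[1] and class 1 otherwise.
W = [[1.0, -1.0], [-1.0, 1.0]]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def classify(x):
    """Class probabilities of the fixed classifier for input x."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])

def predict(x):
    p = classify(x)
    return p.index(max(p))

def generator(x_adv, A, b):
    """Hypothetical linear generator: inverse perturbation = A @ x_adv + b."""
    return [sum(a * v for a, v in zip(row, x_adv)) + bi
            for row, bi in zip(A, b)]

def reconstruct(x_adv, A, b):
    """Add the inverse perturbation and clip to the valid range [0, 1]."""
    delta = generator(x_adv, A, b)
    return [min(1.0, max(0.0, v + d)) for v, d in zip(x_adv, delta)]

def train(examples, steps=20, lr=0.5):
    """Full-batch gradient descent on the classifier's cross-entropy loss."""
    n = len(W[0])
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for _ in range(steps):
        grad_A = [[0.0] * n for _ in range(n)]
        grad_b = [0.0] * n
        for x_adv, label in examples:
            p = classify(reconstruct(x_adv, A, b))
            err = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
            # Gradient of the loss w.r.t. the reconstructed example: W^T (p - y).
            # Clipping is ignored in the backward pass for simplicity.
            g = [sum(W[k][i] * err[k] for k in range(len(W))) for i in range(n)]
            for i in range(n):
                grad_b[i] += g[i]
                for j in range(n):
                    grad_A[i][j] += g[i] * x_adv[j]
        for i in range(n):
            b[i] -= lr * grad_b[i]
            for j in range(n):
                A[i][j] -= lr * grad_A[i][j]
    return A, b

# Adversarial examples paired with their true labels (made-up toy data).
examples = [([0.4, 0.6], 0), ([0.6, 0.4], 1)]
A, b = train(examples)
before = [predict(x) for x, _ in examples]
after = [predict(reconstruct(x, A, b)) for x, _ in examples]
print("before:", before, "after:", after)
```

On this toy data the classifier mislabels both adversarial inputs before reconstruction and labels both reconstructed inputs correctly afterwards. A real implementation would replace the linear generator with a trained network and would typically also constrain the size of the inverse perturbation.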