面向鲁棒学习的对抗训练技术综述
A survey on adversarial training for robust learning
2023, Vol. 28, No. 12: 3629-3650
Print publication date: 2023-12-16
DOI: 10.11834/jig.220953
隋晨红, 王奥, 周圣文, 臧安康, 潘云豪, 刘颢, 王海鹏. 2023. 面向鲁棒学习的对抗训练技术综述. 中国图象图形学报, 28(12):3629-3650
Sui Chenhong, Wang Ao, Zhou Shengwen, Zang Ankang, Pan Yunhao, Liu Hao, Wang Haipeng. 2023. A survey on adversarial training for robust learning. Journal of Image and Graphics, 28(12):3629-3650
Deep learning has achieved great success in many fields. However, its strong data-fitting ability conceals an unexplained phenomenon of "shortcut learning", which makes deep models fragile, vulnerable to attack, and therefore a security risk. Numerous studies have shown that by adding tiny, humanly imperceptible perturbations to normal data, an attacker can cause a model to produce catastrophically wrong outputs, which severely limits the application of deep learning in security-sensitive fields. In response, researchers have proposed a variety of adversarial defense methods, among which adversarial training is the typical heuristic defense. It unifies adversarial attack and adversarial defense in a single framework: on the one hand, adversarial examples are generated by learning to attack the existing model; on the other hand, those adversarial examples are used for further model training, thereby improving robustness. Centered on adversarial training, this paper first describes its basic framework; second, it categorizes the methods and key technologies for adversarial example generation and defensive model training under this framework; it then summarizes the datasets and attack methods used to evaluate the robustness achieved by adversarial training; finally, by analyzing the challenges that adversarial training currently faces, it outlines several directions for future development.
Deep learning has achieved great success in many fields. However, its strong data-fitting ability hides an unexplained phenomenon of "shortcut learning", which leaves deep models vulnerable. Many studies have shown that if an attacker adds slight perturbations, imperceptible to human beings, to normal data, the model may produce catastrophically wrong outputs, which severely limits the application of deep learning in security-sensitive fields. To counter the threat of such malicious attacks and improve model robustness, researchers have proposed a variety of adversarial defense methods. Existing defenses for deep neural networks fall into three categories: methods based on modifying the input data, methods based on directly enhancing the network, and methods based on adversarial training.

Input-modification defenses alter the input in advance, reducing the attack intensity at the input end via denoising, image transformation, and similar operations. Although they exhibit a certain anti-attack ability, these methods are limited by the attack intensity and also tend to over-correct normal input data: the former limitation keeps them from handling slight perturbations that humans cannot perceive, while the latter exposes them to the risk of misjudging normal data and thus lowers classification accuracy. Network-enhancement defenses improve the anti-attack capability of the network directly, by adding subnetworks or by changing the loss function, activation function, batch-normalization layers, or the network training process. Compared with the other two categories, adversarial training is a typical heuristic defense. It injects adversarial attack and adversarial defense into a single framework: adversarial examples are first generated by attacking the existing model, and these examples are then used to train the target model to produce accurate outputs on them, thereby enhancing its robustness.

This paper therefore focuses primarily on adversarial training. Although adversarial training confers a certain ability to defend against attacks, it improves robustness at the cost of classification or recognition accuracy on normal data; many researchers have found that the more robust a model becomes, the lower its accuracy on normal examples. Moreover, the defense effect of current adversarial training remains unsatisfactory against strong adversarial attacks with diversified attack modes. To address these issues, recent studies have improved standard adversarial training from different perspectives. Some generate adversarial examples with high diversity or transferability in the attack stage. To enhance model robustness, many scholars combine adversarial training with network enhancement, involving network-structure modification, model-parameter adjustment, and adversarial-training acceleration, which helps the model resist different types of attack. Standard adversarial training, however, considers only the classification of adversarial examples in the defense stage and ignores the classification accuracy on the original examples.
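For reference, the attack-defense framework described above is usually formalized as a saddle-point (min-max) problem, in the style popularized by Madry et al. (2017): an inner maximization crafts a worst-case perturbation $\delta$ within an $\ell_p$-ball of radius $\epsilon$, and an outer minimization fits the model parameters $\theta$ to the perturbed inputs:

\[
\min_{\theta}\; \mathbb{E}_{(x,y)\sim \mathcal{D}} \Big[ \max_{\|\delta\|_{p} \le \epsilon} \mathcal{L}\big(f_{\theta}(x+\delta),\, y\big) \Big]
\]

where $f_{\theta}$ denotes the network, $\mathcal{L}$ the training loss (typically cross-entropy), and $\mathcal{D}$ the data distribution. The attack-stage and defense-stage techniques surveyed here can be read as different instantiations of the inner and outer problems, respectively.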
To balance the two, many works not only introduce spatial or semantic consistency constraints between the original and adversarial examples but also require the model to produce accurate outputs on the adversarial examples, ensuring that the model attains both robustness and accuracy. To enhance the transferability of the model, curriculum learning, reinforcement learning, metric learning, and domain-adaptation technologies have also been integrated into adversarial training.

Against this background, this paper comprehensively reviews adversarial training technologies. First, the basic framework of adversarial training is elaborated. Second, typical methods and key technologies for generating adversarial examples are reviewed: we summarize generation methods based on image-space, feature-space, and physical-space attacks and, to improve the diversity of adversarial examples, introduce interpolation- and reinforcement-learning-based generation strategies. Because standard adversarial training is extremely time-consuming, we also briefly describe optimization strategies based on temporal, spatial, and spatiotemporal mixed momentum, which help improve training efficiency. Defense is the fundamental problem of adversarial training, devoted to absorbing the generated adversarial examples into training via loss minimization; we therefore review the technologies typically used in the defensive training stage, including loss-regularization terms, model-enhancement mechanisms, parameter adaptation, early stopping, and semi-supervised or unsupervised extension strategies. To support robustness evaluation, we summarize the popular datasets and typical attack methods. Finally, even after sorting out the relevant technologies, adversarial training still faces challenges in dealing with integrated multi-perturbation attacks and in its low training efficiency; we put these problems forward as directions for future research.
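To make the attack-then-defend cycle concrete, the following is a minimal sketch of one adversarial training step with an $\ell_\infty$ PGD inner loop. It assumes a PyTorch image classifier `model` and an `optimizer`; the function names and hyperparameter values (`eps`, `alpha`, `steps`) are illustrative defaults, not the settings of any specific method surveyed here.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Inner maximization: L-infinity PGD from a random start."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend along the gradient sign, then project back into the eps-ball.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y):
    """Outer minimization: one parameter update on adversarial examples."""
    model.eval()                      # fix BN/dropout statistics while attacking
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Setting `steps=1` with `alpha=eps` roughly recovers a single-step, FGSM-style variant that trades robustness for the training speed discussed above, while stronger or more diverse inner attacks slot into `pgd_attack` without changing the outer loop.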
Keywords: deep learning; adversarial attack; adversarial defense; adversarial training; robustness