Enhancing Adversarial Robustness via Uncertainty-Aware Distributional Adversarial Training

Type: Preprint

Publication Date: 2024-11-05

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2411.02871

Abstract

Despite remarkable achievements of deep learning across various domains, its inherent vulnerability to adversarial examples remains a critical concern for practical deployment. Adversarial training has emerged as one of the most effective defenses for improving model robustness against such malicious inputs. However, existing adversarial training schemes often generalize poorly to diverse adversaries because they rely on a point-by-point augmentation strategy that maps each clean example to a single adversarial counterpart during training. In addition, adversarial examples can substantially disrupt the statistical information associated with the target model, introducing considerable uncertainty and making the distribution of adversarial examples difficult to model. To address these issues, this paper proposes a novel uncertainty-aware distributional adversarial training method that models the adversary by leveraging both the statistical information of adversarial examples and its corresponding uncertainty estimate, with the goal of increasing the diversity of adversaries. Considering the potentially negative impact of aligning adversaries to misclassified clean examples, we also refine the alignment reference based on statistical proximity to clean examples during adversarial training, thereby reframing adversarial training as a distribution-to-distribution matching problem between the clean and adversarial domains. Furthermore, we design an introspective gradient alignment approach that matches input gradients between these domains without introducing external models. Extensive experiments across four benchmark datasets and various network architectures demonstrate that our approach achieves state-of-the-art adversarial robustness while maintaining natural performance.
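
Illustrative sketch (not part of the paper): the abstract combines two generic ingredients that can be shown in code, namely adversarial training on crafted examples and an alignment of input gradients between the clean and adversarial domains. The minimal PyTorch sketch below assumes a standard PGD attack and a cosine-similarity gradient-alignment penalty; the functions pgd_attack, input_gradient, and train_step, the weighting lam, and all hyperparameters are hypothetical placeholders and do not reproduce the authors' uncertainty-aware distributional formulation or their refined alignment reference.

import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft L-infinity PGD adversarial examples (standard formulation, not the paper's adversary)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the epsilon ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv


def input_gradient(model, x, y):
    """Gradient of the classification loss w.r.t. the input."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # create_graph=True lets the alignment penalty backpropagate into the model parameters.
    return torch.autograd.grad(loss, x, create_graph=True)[0]


def train_step(model, optimizer, x, y, lam=1.0):
    """One adversarial training step with a cosine input-gradient alignment term
    between the clean and adversarial domains (lam is a hypothetical weighting)."""
    x_adv = pgd_attack(model, x, y)
    g_clean = input_gradient(model, x, y)
    g_adv = input_gradient(model, x_adv, y)
    # Penalize misalignment between clean and adversarial input gradients.
    align = 1.0 - F.cosine_similarity(g_clean.flatten(1), g_adv.flatten(1), dim=1).mean()
    loss = F.cross_entropy(model(x_adv), y) + lam * align
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In a full training run, train_step would be invoked once per mini-batch; the uncertainty-aware distributional adversary described in the abstract would replace the single PGD adversary used here.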

Locations

  • arXiv (Cornell University)

Similar Works

  • Adversarial Distributional Training for Robust Deep Learning (2020) - Yinpeng Dong, Zhijie Deng, Tianyu Pang, Hang Su, Jun Zhu
  • A Unified Wasserstein Distributional Robustness Framework for Adversarial Training (2022) - Tuan Anh Bui, Trung Le, Quan Tran, He Zhao, Dinh Phung
  • Recent Advances in Adversarial Training for Adversarial Robustness (2021) - Tao Bai, Jinqi Luo, Jun Zhao, Bihan Wen, Qian Wang
  • Modeling Adversarial Noise for Adversarial Training (2021) - Dawei Zhou, Nannan Wang, Bo Han, Tongliang Liu
  • Improving Adversarial Training using Vulnerability-Aware Perturbation Budget (2024) - Olukorede Fakorede, Modeste Atsague, Jin Tian
  • Provable Unrestricted Adversarial Training without Compromise with Generalizability (2023) - Lilin Zhang, Ning Yang, Yanchao Sun, Philip S. Yu
  • Generating Less Certain Adversarial Examples Improves Robust Generalization (2023) - Minxing Zhang, Michael Backes, Zhang Xiao
  • Guided Interpolation for Adversarial Training (2021) - Chen Chen, Jingfeng Zhang, Xilie Xu, Tianlei Hu, Gang Niu, Gang Chen, Masashi Sugiyama
  • Regional Adversarial Training for Better Robust Generalization (2021) - Chuanbiao Song, Yanbo Fan, Yicheng Yang, Baoyuan Wu, Yiming Li, Zhifeng Li, Kun He
  • Adversarial Training in Low-Label Regimes with Margin-Based Interpolation (2024) - Tian Ye, Rajgopal Kannan, Viktor Prasanna
  • Omnipotent Adversarial Training in the Wild (2023) - Guanlin Li, Kangjie Chen, Yuan Xu, Han Qiu, Tianwei Zhang
  • Maximum-Entropy Adversarial Data Augmentation for Improved Generalization and Robustness (2020) - L. Zhao, Ting Liu, Xi Peng, Dimitris Metaxas
  • Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing (2023) - Lin Li, Michael Spratling
  • DART: A Principled Approach to Adversarially Robust Unsupervised Domain Adaptation (2024) - Yunjuan Wang, Hussein Hazimeh, Natalia Ponomareva, Alexey Kurakin, Ibrahim Hammoud, Raman Arora
  • Toward Understanding and Boosting Adversarial Transferability From a Distribution Perspective (2022) - Yao Zhu, Yuefeng Chen, Xiaodan Li, Kejiang Chen, Yuan He, Xiang Tian, Bolun Zheng, Yaowu Chen, Qingming Huang
  • Beneficial Perturbations Network for Defending Adversarial Examples (2020) - Shixian Wen, Laurent Itti
  • Domain Invariant Adversarial Learning (2021) - Matan Levi, Idan Attias, Aryeh Kontorovich

Works That Cite This (0)


Works Cited by This (0)
