Fisher Information guided Purification against Backdoor Attacks

Type: Preprint

Publication Date: 2024-09-01

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2409.00863

Abstract

Recent studies on backdoor attacks suggest that an adversary can compromise the integrity of a deep neural network (DNN) by manipulating a small set of training samples. Our analysis shows that such manipulation can make the backdoor model converge to a bad local minimum, i.e., a sharper minimum than that of a benign model. Intuitively, the backdoor can be purified by re-optimizing the model to a smoother minimum. However, naively adopting any optimization method that targets smoother minima can lead to sub-optimal purification that hampers clean test accuracy. Hence, to obtain such re-optimization effectively, and inspired by our novel perspective connecting backdoor removal and loss smoothness, we propose Fisher Information guided Purification (FIP), a novel backdoor purification framework. FIP consists of a couple of novel regularizers that exploit knowledge of the Fisher Information Matrix (FIM) to help the model suppress backdoor effects while retaining the acquired knowledge of the clean data distribution throughout the backdoor removal procedure. In addition, we introduce an efficient variant of FIP, dubbed Fast FIP, which significantly reduces the number of tunable parameters and obtains a runtime gain of almost 5×. Extensive experiments show that the proposed method achieves state-of-the-art (SOTA) performance on a wide range of backdoor defense benchmarks: 5 different tasks (Image Recognition, Object Detection, Video Action Recognition, 3D Point Cloud, Language Generation); 11 different datasets, including ImageNet, PASCAL VOC, and UCF101; diverse model architectures spanning both CNNs and vision transformers; and 14 different backdoor attacks, e.g., Dynamic, WaNet, LIRA, and ISSBA.
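
To illustrate what "exploiting the Fisher Information Matrix" can look like in practice, here is a minimal sketch. It is not the paper's FIP regularizers, which the abstract does not specify. It assumes a toy logistic-regression model, computes the diagonal of the empirical FIM (the mean squared per-sample gradient of the log-likelihood), and uses it in an elastic-weight-consolidation-style quadratic penalty, a well-known technique swapped in here for illustration. The function names, the toy samples, and the `strength` parameter are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def diagonal_fisher(weights, samples):
    """Diagonal of the empirical Fisher Information for logistic regression.

    Each entry is the mean over samples of the squared gradient of
    log p(y | x; w) with respect to one weight.
    """
    fisher = [0.0] * len(weights)
    for x, y in samples:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for i, xi in enumerate(x):
            grad = (y - p) * xi  # d/dw_i of the log-likelihood
            fisher[i] += grad * grad
    return [f / len(samples) for f in fisher]


def fisher_penalty(weights, anchor, fisher, strength=1.0):
    """EWC-style penalty: 0.5 * strength * sum_i F_i * (w_i - anchor_i)^2.

    Weights with large Fisher values are held close to the anchor, so
    knowledge encoded in them is less likely to be overwritten.
    """
    return 0.5 * strength * sum(
        f * (w - a) ** 2 for f, w, a in zip(fisher, weights, anchor)
    )


# Toy clean samples (hypothetical): (features, label)
samples = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1), ([2.0, 0.0], 1)]
anchor = [0.0, 0.0]  # weights at which the Fisher diagonal is evaluated
fisher = diagonal_fisher(anchor, samples)
trace = sum(fisher)  # FIM trace, sometimes used as a rough curvature proxy
penalty = fisher_penalty([1.0, 1.0], anchor, fisher)
```

In a fine-tuning loop, a penalty of this kind would be added to the task loss computed on clean data; how FIP actually combines its regularizers is described in the paper itself, not here.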

Locations

  • arXiv (Cornell University)

Similar Works

Title | Year | Authors
Fisher Information guided Purification against Backdoor Attacks | 2024 | Nazmul Karim, Abdullah Al Arafat, Adnan Siraj Rakin, Zhishan Guo, Nazanin Rahnavard
Efficient Backdoor Removal Through Natural Gradient Fine-tuning | 2023 | Nazmul Karim, Abdullah Al Arafat, Umar Khalid, Zhishan Guo, Naznin Rahnavard
Single Image Backdoor Inversion via Robust Smoothed Classifiers | 2023 | Ming-Jie Sun, Zico Kolter
Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models | 2024 | Yige Li, Hanxun Huang, Jiaming Zhang, Xingjun Ma, Yu-Gang Jiang
Backdoor Cleansing with Unlabeled Data | 2022 | Lu Pang, Tong Sun, Haibin Ling, Chao Chen
Bypassing Backdoor Detection Algorithms in Deep Learning | 2019 | Te Juin Lester Tan, Reza Shokri
Bypassing Backdoor Detection Algorithms in Deep Learning | 2020 | Te Juin Lester Tan, Reza Shokri
Backdoor Learning: A Survey | 2020 | Yiming Li, Yong Jiang, Zhifeng Li, Shu-Tao Xia
Sniper Backdoor: Single Client Targeted Backdoor Attack in Federated Learning | 2022 | Gorka Abad, Servio Paguada, Stjepan Picek, Víctor Julio Ramírez-Durán, Aitor Urbieta
Beating Backdoor Attack at Its Own Game | 2023 | Min Liu, Alberto Sangiovanni-Vincentelli, Xiangyu Yue
Sniper Backdoor: Single Client Targeted Backdoor Attack in Federated Learning | 2023 | Gorka Abad, Servio Paguada, Oğuzhan Ersoy, Stjepan Picek, Víctor Julio Ramírez-Durán, Aitor Urbieta
Anti-Backdoor Learning: Training Clean Models on Poisoned Data | 2021 | Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, Xingjun Ma
Backdoor Cleansing with Unlabeled Data | 2023 | Lu Pang, Tao Sun, Haibin Ling, Chao Chen
BAN: Detecting Backdoors Activated by Adversarial Neuron Noise | 2024 | Xiaoyun Xu, Zhuoran Liu, Stefanos Koffas, Shujian Yu, Stjepan Picek
Backdoor Defense via Decoupling the Training Process | 2022 | Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, Kui Ren
PAD-FT: A Lightweight Defense for Backdoor Attacks via Data Purification and Fine-Tuning | 2024 | Yukai Xu, Yujie Gu, Kouichi Sakurai
Backdoor Learning: A Survey | 2022 | Yiming Li, Yong Jiang, Zhifeng Li, Shu-Tao Xia

Works That Cite This (0)


Works Cited by This (0)
