Fisher Information guided Purification against Backdoor Attacks

Nazmul Karim, Abdullah Al Arafat, Adnan Siraj Rakin, Zhishan Guo, Nazanin Rahnavard

Type: Preprint

Publication Date: 2024-09-01

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2409.00863

Abstract

Recent studies on backdoor attacks suggest that an adversary can compromise the integrity of a deep neural network (DNN) by manipulating a small set of training samples. Our analysis shows that such manipulation can make the backdoor model converge to a bad local minimum, i.e., a sharper minimum than that of a benign model. Intuitively, the backdoor can be purified by re-optimizing the model to a smoother minimum. However, naively adopting any optimization that targets smoother minima can yield sub-optimal purification that hampers clean test accuracy. Hence, to perform this re-optimization effectively, and inspired by our novel perspective connecting backdoor removal to loss smoothness, we propose Fisher Information guided Purification (FIP), a novel backdoor purification framework. FIP consists of a couple of novel regularizers that exploit knowledge of the Fisher Information Matrix (FIM) to help the model suppress backdoor effects while retaining its acquired knowledge of the clean data distribution throughout the backdoor removal procedure. In addition, we introduce an efficient variant of FIP, dubbed Fast FIP, which significantly reduces the number of tunable parameters and obtains a runtime gain of almost $5\times$. Extensive experiments show that the proposed method achieves state-of-the-art (SOTA) performance on a wide range of backdoor defense benchmarks: 5 different tasks (Image Recognition, Object Detection, Video Action Recognition, 3D Point Cloud, Language Generation); 11 different datasets, including ImageNet, PASCAL VOC, and UCF101; diverse model architectures spanning both CNNs and vision transformers; and 14 different backdoor attacks, e.g., Dynamic, WaNet, LIRA, and ISSBA.
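
The abstract does not spell out the form of FIP's regularizers, so the following is only a minimal illustrative sketch of how Fisher information can weight a knowledge-retention penalty during fine-tuning. It assumes a toy logistic-regression model, a diagonal empirical Fisher (the mean of squared per-sample gradients), and a quadratic drift penalty in the style of elastic weight consolidation. The function names, the toy data, and the penalty form are all hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def per_sample_grad(theta, x, y):
    """Gradient of the logistic negative log-likelihood for one sample (y in {0, 1})."""
    p = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
    return [(p - y) * xi for xi in x]

def diagonal_fisher(theta, data):
    """Diagonal of the empirical Fisher: mean of squared per-sample gradients."""
    fisher = [0.0] * len(theta)
    for x, y in data:
        g = per_sample_grad(theta, x, y)
        fisher = [f + gi * gi for f, gi in zip(fisher, g)]
    return [f / len(data) for f in fisher]

def fisher_penalty(theta, theta_ref, fisher):
    """Quadratic drift penalty: parameters with high Fisher values are held closer
    to the reference weights (an assumed, EWC-style form, not the paper's)."""
    return sum(f * (t - r) ** 2 for f, t, r in zip(fisher, theta, theta_ref))

# Hypothetical clean validation samples: (features, label).
clean_data = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1)]
theta_ref = [0.0, 0.0]  # stand-in for the weights before purification
fisher = diagonal_fisher(theta_ref, clean_data)
penalty = fisher_penalty([0.5, -0.5], theta_ref, fisher)
```

In a purification loop, a term like `penalty` would be added to the clean-data loss so that fine-tuning can move away from the backdoored solution while limiting drift in the parameters the Fisher estimate marks as important.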

Locations

arXiv (Cornell University)

Similar Works

Title	Year	Authors
Fisher Information guided Purification against Backdoor Attacks	2024	Nazmul Karim, Abdullah Al Arafat, Adnan Siraj Rakin, Zhishan Guo, Nazanin Rahnavard
Efficient Backdoor Removal Through Natural Gradient Fine-tuning	2023	Nazmul Karim, Abdullah Al Arafat, Umar Khalid, Zhishan Guo, Nazanin Rahnavard
Single Image Backdoor Inversion via Robust Smoothed Classifiers	2023	Ming-Jie Sun, Zico Kolter
Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models	2024	Yige Li, Hanxun Huang, Jiaming Zhang, Xingjun Ma, Yu-Gang Jiang
Backdoor Cleansing with Unlabeled Data	2022	Lu Pang, Tong Sun, Haibin Ling, Chao Chen
Bypassing Backdoor Detection Algorithms in Deep Learning	2019	Te Juin Lester Tan, Reza Shokri
Bypassing Backdoor Detection Algorithms in Deep Learning	2020	Te Juin Lester Tan, Reza Shokri
Backdoor Learning: A Survey	2020	Yiming Li, Yong Jiang, Zhifeng Li, Shu-Tao Xia
Sniper Backdoor: Single Client Targeted Backdoor Attack in Federated Learning	2022	Gorka Abad, Servio Paguada, Stjepan Picek, Víctor Julio Ramírez-Durán, Aitor Urbieta
Beating Backdoor Attack at Its Own Game	2023	Min Liu, Alberto Sangiovanni-Vincentelli, Xiangyu Yue
Sniper Backdoor: Single Client Targeted Backdoor Attack in Federated Learning	2023	Gorka Abad, Servio Paguada, Oğuzhan Ersoy, Stjepan Picek, Víctor Julio Ramírez-Durán, Aitor Urbieta
Anti-Backdoor Learning: Training Clean Models on Poisoned Data	2021	Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, Xingjun Ma
Backdoor Cleansing with Unlabeled Data	2023	Lu Pang, Tao Sun, Haibin Ling, Chao Chen
BAN: Detecting Backdoors Activated by Adversarial Neuron Noise	2024	Xiaoyun Xu, Zhuoran Liu, Stefanos Koffas, Shujian Yu, Stjepan Picek
Backdoor Defense via Decoupling the Training Process	2022	Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, Kui Ren
PAD-FT: A Lightweight Defense for Backdoor Attacks via Data Purification and Fine-Tuning	2024	Yukai Xu, Yujie Gu, Kouichi Sakurai
Backdoor Learning: A Survey	2022	Yiming Li, Yong Jiang, Zhifeng Li, Shu-Tao Xia

Works That Cite This (0)

Works Cited by This (0)