Cassandra: Detecting Trojaned Networks From Adversarial Perturbations

Type: Article

Publication Date: 2021-01-01

Citations: 14

DOI: https://doi.org/10.1109/access.2021.3101289

Abstract

Deep neural networks are being widely deployed for critical tasks. In many cases, pre-trained models are sourced from vendors who may have disrupted the training pipeline to insert Trojan behaviors. These malicious behaviors can be triggered at the adversary's will, which poses a serious security threat. To verify the integrity of a deep model, we propose a method that captures its fingerprint with adversarial perturbations. Inserting backdoors into a network alters its decision boundaries, which are effectively encoded by adversarial perturbations. Our proposed Trojan detection network learns features from adversarial patterns and their properties to encode the unknown trigger shape and the deviations in the decision boundaries caused by backdoors. Our method works without any clean samples, or with a limited number of clean samples for improved performance. It also performs anomaly detection to identify the target class of a Trojaned network, is invariant to the trigger type, trigger size, and network architecture, and does not require any triggered samples. Experiments are performed on the MNIST, NIST-TrojAI, and Odysseus datasets, with 5000 pre-trained models in total, making this the largest study to date on Trojaned network detection, and new state-of-the-art accuracy is achieved.

Locations

  • IEEE Access
  • arXiv (Cornell University)
  • DOAJ (Directory of Open Access Journals)

Similar Works

  • Cassandra: Detecting Trojaned Networks from Adversarial Perturbations (2020). Xiaoyu Zhang, Ajmal Mian, Rohit Gupta, Nazanin Rahnavard, Mubarak Shah
  • Detecting Trojaned DNNs Using Counterfactual Attributions (2020). Karan Sikka, Indranil Sur, Susmit Jha, Anirban Roy, Ajay Divakaran
  • Odyssey: Creation, Analysis and Detection of Trojan Models (2021). Marzieh Edraki, Nazmul Karim, Nazanin Rahnavard, Ajmal Mian, Mubarak Shah
  • Odyssey: Creation, Analysis and Detection of Trojan Models (2020). Marzieh Edraki, Nazmul Karim, Nazanin Rahnavard, Ajmal Mian, Mubarak Shah
  • MDTD: A Multi-Domain Trojan Detector for Deep Neural Networks (2023). Arezoo Rajabi, Surudhi Asokraj, Fengqing Jiang, Luyao Niu, Bhaskar Ramasubramanian, James A. Ritcey, Radha Poovendran
  • A Feature-Based On-Line Detector to Remove Adversarial-Backdoors by Iterative Demarcation (2022). Hao Fu, Akshaj Kumar Veldanda, P. Krishnamurthy, Siddharth Garg, Farshad Khorrami
  • FreeEagle: Detecting Complex Neural Trojans in Data-Free Cases (2023). Chong Fu, Xuhong Zhang, Shouling Ji, Ting Wang, Peng Lin, Yanghe Feng, Jianwei Yin
  • Detecting Trojaned DNNs Using Counterfactual Attributions (2023). Karan Sikka, Indranil Sur, Anirban Roy, Ajay Divakaran, Susmit Jha
  • Detecting Backdoors in Neural Networks Using Novel Feature-Based Anomaly Detection (2020). Hao Fu, Akshaj Kumar Veldanda, P. Krishnamurthy, Siddharth Garg, Farshad Khorrami
  • Scanning Trojaned Models Using Out-of-Distribution Samples (2025). Hossein Mirzaei, Ali Ansari, Bahar Dibaei Nia, Mojtaba Nafez, Moein Madadi, Sepehr Rezaee, Zeinab Taghavi, Amjad Maleki, Kian Shamsaie, Mahdi Hajialilue
  • MDTD: A Multi Domain Trojan Detector for Deep Neural Networks (2023). Arezoo Rajabi, Surudhi Asokraj, Fengqing Jiang, Luyao Niu, Bhaskar Ramasubramanian, Jim Ritcey, Radha Poovendran
  • An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks (2020). Ruixiang Tang, Mengnan Du, Ninghao Liu, Fan Yang, Xia Hu
  • Scalable Backdoor Detection in Neural Networks (2020). Haripriya Harikumar, Vuong Le, Santu Rana, Sourangshu Bhattacharya, Sunil Gupta, Svetha Venkatesh
  • Februus: Input Purification Defense Against Trojan Attacks on Deep Neural Network Systems (2020). Bao Gia Doan, Ehsan Abbasnejad, Damith C. Ranasinghe
  • Backdoor Mitigation by Correcting the Distribution of Neural Activations (2023). Xi Li, Zhen Xiang, David J. Miller, George Kesidis
  • Can We Mitigate Backdoor Attack Using Adversarial Detection Methods? (2022). Kaidi Jin, Tianwei Zhang, Chao Shen, Yufei Chen, Ming Fan, Chenhao Lin, Ting Liu
  • Can We Mitigate Backdoor Attack Using Adversarial Detection Methods? (2020). Kaidi Jin, Tianwei Zhang, Chao Shen, Yufei Chen, Ming Fan, Chenhao Lin, Ting Liu