Adversarial Perturbations Fool Deepfake Detectors

Type: Article

Publication Date: 2020-07-01

Citations: 101

DOI: https://doi.org/10.1109/ijcnn48605.2020.9207034

Abstract

This work uses adversarial perturbations to enhance deepfake images and fool common deepfake detectors. We created adversarial perturbations using the Fast Gradient Sign Method (FGSM) and the Carlini and Wagner L2-norm attack in both black-box and white-box settings. Detectors achieved over 95% accuracy on unperturbed deepfakes, but less than 27% accuracy on perturbed deepfakes. We also explore two improvements to deepfake detectors: (i) Lipschitz regularization, and (ii) Deep Image Prior (DIP). Lipschitz regularization constrains the gradient of the detector with respect to the input in order to increase robustness to input perturbations. The DIP defense removes perturbations using generative convolutional neural networks in an unsupervised manner. Lipschitz regularization improved the detection of perturbed deepfakes on average, including a 10% accuracy boost in the black-box case. The DIP defense achieved 95% accuracy on perturbed deepfakes that fooled the original detector, while retaining 98% accuracy in other cases on a 100-image subsample.
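
To make the abstract's three techniques concrete, here are minimal sketches; none of this code is from the paper. The first illustrates the FGSM attack, assuming a PyTorch detector that returns one real/fake logit per image, images normalized to [0, 1], and a hypothetical helper name `fgsm_perturb`:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(detector, images, labels, eps=0.01):
    # Hypothetical helper: one signed-gradient ascent step on the
    # detector's loss, the core of the Fast Gradient Sign Method.
    images = images.clone().detach().requires_grad_(True)
    loss = F.binary_cross_entropy_with_logits(detector(images), labels)
    loss.backward()
    adversarial = images + eps * images.grad.sign()  # step against the detector
    return adversarial.clamp(0.0, 1.0).detach()      # keep a valid image
```

Likewise, the Lipschitz regularization the abstract describes can be approximated by penalizing the norm of the detector's input gradient during training; `lam` is an assumed weighting hyperparameter, and the term would simply be added to the usual classification loss:

```python
def lipschitz_penalty(detector, images, lam=1.0):
    # A small d(logit)/d(input) bounds how far a small input perturbation
    # can move the detector's output, the intuition behind the defense.
    images = images.clone().detach().requires_grad_(True)
    logit_sum = detector(images).sum()
    (grad,) = torch.autograd.grad(logit_sum, images, create_graph=True)
    return lam * grad.flatten(start_dim=1).pow(2).sum(dim=1).mean()
```

Finally, a Deep-Image-Prior-style purification sketch, assuming `generator` is an untrained convolutional network mapping a fixed 32-channel noise code to an image-shaped tensor; the early-stopping budget `steps` is what keeps the adversarial noise from being reproduced:

```python
def dip_purify(generator, perturbed, steps=500, lr=0.01):
    # Fit the untrained generator to the perturbed image; natural image
    # structure is fit before high-frequency adversarial noise, so an
    # early-stopped reconstruction is a cleaned input for the detector.
    z = torch.randn(1, 32, perturbed.shape[-2], perturbed.shape[-1])
    optimizer = torch.optim.Adam(generator.parameters(), lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        F.mse_loss(generator(z), perturbed).backward()
        optimizer.step()
    return generator(z).detach()
```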

Locations

  • arXiv (Cornell University)
  • 2020 International Joint Conference on Neural Networks (IJCNN)

Similar Works

  • Adversarial Perturbations Fool Deepfake Detectors (2020): Apurva Gandhi, Shomik Jain
  • Imperceptible Adversarial Examples for Fake Image Detection (2021): Quanyu Liao, Yuezun Li, Xin Wang, Bin Kong, Bin Zhu, Siwei Lyu, Youbing Yin, Qi Song, Xi Wu
  • Adversarial Threats to DeepFake Detection: A Practical Perspective (2021): Paarth Neekhara, Brian Dolhansky, Joanna Bitton, Cristian Canton Ferrer
  • On Detecting Adversarial Perturbations (2016): Jan Hendrik Metzen, Tim Genewein, Volker Fischer, Bastian Bischoff
  • On Detecting Adversarial Perturbations (2017): Jan Hendrik Metzen, Tim Genewein, Volker Fischer, Bastian Bischoff
  • Simple Black-Box Adversarial Perturbations for Deep Networks (2016): Nina Narodytska, Shiva Prasad Kasiviswanathan
  • One Sparse Perturbation to Fool them All, almost Always! (2020): Arka Ghosh, Sankha Subhra Mullick, Shounak Datta, Swagatam Das, Rammohan Mallipeddi, Asit Kumar Das
  • Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey (2018): Naveed Akhtar, Ajmal Mian
  • A New Kind of Adversarial Example (2022): Ali Borji
  • FDA: Feature Disruptive Attack (2019): Aditya Ganeshan, B. S. Vivek, R. Venkatesh Babu
  • Mitigating Adversarial Attacks in Deepfake Detection: An Exploration of Perturbation and AI Techniques (2023): Saminder Dhesi, Laura Fontes, Pedro Machado, Isibor Kennedy Ihianle, Farhad Fassihi Tash, David Ada Adama
  • Adversarial Threats to DeepFake Detection: A Practical Perspective (2020): Paarth Neekhara, Brian Dolhansky, Joanna Bitton, Cristian Canton Ferrer
  • Bluff: Interactively Deciphering Adversarial Attacks on Deep Neural Networks (2020): Nilaksh Das, Haekyu Park, Zijie J. Wang, Fred Hohman, Robert Firstman, Emily Rogers, Duen Horng Chau

Works That Cite This (26)

  • Countering Malicious DeepFakes: Survey, Battleground, and Horizon (2022): Felix Juefei-Xu, Run Wang, Yihao Huang, Qing Guo, Lei Ma, Yang Liu
  • Evading DeepFake Detectors via Adversarial Statistical Consistency (2023): Yang Hou, Qing Guo, Yihao Huang, Xiaofei Xie, Lei Ma, Jianjun Zhao
  • Imperceptible Adversarial Examples for Fake Image Detection (2021): Quanyu Liao, Yuezun Li, Xin Wang, Bin Kong, Bin Zhu, Siwei Lyu, Youbing Yin, Qi Song, Xi Wu
  • Deep learning for deepfakes creation and detection: A survey (2022): Thanh Thi Nguyen, Quoc Viet Hung Nguyen, Dung T. Nguyen, Duc Thanh Nguyen, Thien Huynh‐The, Saeid Nahavandi, Thành Tâm Nguyên, Quoc‐Viet Pham, Cuong M. Nguyen
  • Protecting Against Image Translation Deepfakes by Leaking Universal Perturbations from Black-Box Neural Networks (2020): Nataniel Ruiz, Sarah Adel Bargal, Stan Sclaroff
  • Evading Deepfake-Image Detectors with White- and Black-Box Attacks (2020): Nicholas Carlini, Hany Farid
  • Deepfake Style Transfer Mixture: A First Forensic Ballistics Study on Synthetic Images (2022): Luca Guarnera, Oliver Giudice, Sebastiano Battiato
  • D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles (2024): Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha, Atul Prakash
  • FakeTagger: Robust Safeguards against DeepFake Dissemination via Provenance Tracking (2020): Run Wang, Felix Juefei-Xu, Meng Luo, Yang Liu, Lina Wang