Implementation of adversarial attacks on different deep neural network classifiers, based on the algorithms described in the following papers:

ATTACKS:
- L-BFGS Attack: Intriguing Properties of Neural Networks
- FGSM: Explaining and Harnessing Adversarial Examples (sketched below)
- Vanilla Gradient Attack
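For reference, FGSM perturbs the input by a single step of size epsilon in the direction of the sign of the loss gradient with respect to the input. Below is a minimal sketch of that idea, assuming a PyTorch classifier; it illustrates the formula rather than the exact code in Attacks/FGSM.py.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.1):
    """One-step FGSM: x_adv = x + epsilon * sign(grad_x loss(model(x), y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step each pixel by epsilon in the direction that increases the loss,
    # then clamp back to the valid [0, 1] image range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()
```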
DEFENSES:
- Distilled neural network: Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks
- Adversarial Training (sketched below)
- Binary thresholding
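As an illustration of the adversarial training defense, the sketch below trains on a mix of clean and FGSM-perturbed batches. It assumes a PyTorch model, an optimizer, and the `fgsm_example` helper sketched above; the actual training code in this repository may differ.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.1):
    """One optimization step on a 50/50 mix of clean and FGSM examples."""
    model.eval()                                # craft the attack in eval mode
    x_adv = fgsm_example(model, x, y, epsilon)  # helper sketched above (assumed)
    model.train()
    optimizer.zero_grad()                       # drop gradients left by the attack
    logits = model(torch.cat([x, x_adv]))
    loss = F.cross_entropy(logits, torch.cat([y, y]))
    loss.backward()
    optimizer.step()
    return loss.item()
```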
├── Adversarial_blackbox_attacks.ipynb  <- Notebook for testing black-box attacks
├── Adversarial_whitebox_attacks.ipynb  <- Notebook for testing white-box attacks
├── attack.py                           <- Test functions for the attacks
├── Attacks
│   ├── FGSM.py                         <- FGSM attack class
│   ├── LBFGS.py                        <- L-BFGS attack class
│   └── VanillaGradient.py              <- Vanilla gradient attack class
├── Defense.ipynb                       <- Notebook for testing defenses
├── defense.py                          <- Test functions for the defenses
├── imagenet_classes.txt
├── Net.py                              <- Architectures of the models
├── Results                             <- Resulting images and accuracies
├── utils.py                            <- Plotting functions
└── weights                             <- Weights for the pretrained models
Our work was inspired by the Adversarial Attacks and Defences Competition; we implemented three different attack vectors and three matching defenses.
Adversarial_whitebox_attacks.ipynb: We first implemented the attacks on the architecture defined in Net.py, using the MNIST dataset; the notebook shows the impact of each attack on the accuracy of the model.
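The accuracy figures in the notebook boil down to an evaluation loop of the following shape (a sketch with illustrative names, assuming a PyTorch DataLoader and an attack callable such as `fgsm_example` above; it is not the notebook's exact code):

```python
import torch

def accuracy_under_attack(model, loader, attack=None, **attack_kwargs):
    """Top-1 accuracy on clean inputs (attack=None) or on attacked inputs."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        if attack is not None:
            x = attack(model, x, y, **attack_kwargs)  # e.g. fgsm_example
        with torch.no_grad():
            pred = model(x).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)
    return correct / total
```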
Defense.ipynb: This notebook showcases the robustness of three different defenses against the attacks. You'll find the accuracy of the model with each defense added. The L-BFGS attack was left out of this testing because of its high computational cost.
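The binary thresholding defense exploits the near-binary nature of MNIST pixels: rounding each pixel to 0 or 1 before classification erases small adversarial perturbations. A minimal sketch, wrapping an existing classifier (the 0.5 threshold is an assumption, not necessarily the value used in defense.py):

```python
import torch

class BinaryThresholdWrapper(torch.nn.Module):
    """Binarizes every input before passing it to the wrapped classifier."""
    def __init__(self, model, threshold=0.5):
        super().__init__()
        self.model = model
        self.threshold = threshold

    def forward(self, x):
        return self.model((x > self.threshold).float())
```

A wrapped model can be fed to the same evaluation loop sketched earlier to compare accuracies with and without the defense.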
Adversarial_blackbox_attacks.ipynb: One very interesting property of adversarial examples is their ability to transfer to different models. We tested this property by attacking a model with images generated against a different one. We used a more complex dataset (ants/bees) of 3-channel images for this test.
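Conceptually, the transferability test crafts adversarial examples against a source model and measures how well an independently trained target model still classifies them. A sketch, again assuming PyTorch models and the illustrative helpers above:

```python
import torch

def transfer_attack_accuracy(source_model, target_model, loader, attack, **kwargs):
    """Accuracy of target_model on examples crafted against source_model."""
    target_model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x_adv = attack(source_model, x, y, **kwargs)  # e.g. fgsm_example
        with torch.no_grad():
            pred = target_model(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)
    return correct / total
```

A low accuracy here means the adversarial examples transfer well from the source model to the target model.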