Evaluation
This wiki page lists methods and ideas that can be used to score models with respect to their robustness against adversarial attacks.
Feed linear combinations of two inputs into the model and check whether the classification along the line between the samples is correct. Determine the distance from an image (when linearly approaching an image of another class) to the first misclassified input, and analyze how noisy the classifications along the line are (a code sketch of this probe follows below).
This figure plots the classification along the linear combination between a "1" and a "0" sample from the training data. Our first experiments can be found here.
Goodfellow presents a similar analysis here and shows that the classification works just fine in most directions, except for a few. Therefore, the linear-combination method might not be efficient at spotting vulnerabilities of a model.
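A minimal sketch of the linear-combination probe, assuming a PyTorch classifier `model` and two input tensors `x_a` and `x_b` of the same shape (all names are placeholders, not taken from our experiments):

```python
import torch

def interpolate_predictions(model, x_a, x_b, steps=101):
    """Classify points along the line from x_a to x_b.

    Returns the interpolation coefficients and the predicted class at
    each point, so one can locate the first misclassification and judge
    how noisy the decision boundary is along the line.
    """
    model.eval()
    alphas = torch.linspace(0.0, 1.0, steps)
    # Build the batch of linear combinations (1 - a) * x_a + a * x_b.
    batch = torch.stack([(1 - a) * x_a + a * x_b for a in alphas])
    with torch.no_grad():
        preds = model(batch).argmax(dim=1)
    return alphas, preds

def first_misclassification(alphas, preds, true_label):
    """Distance (in interpolation coefficient) to the first wrong prediction."""
    wrong = (preds != true_label).nonzero(as_tuple=True)[0]
    return alphas[wrong[0]].item() if len(wrong) > 0 else None
```

The returned coefficient of the first misclassification can serve as a crude robustness score per sample pair, and the number of label switches along the line indicates how noisy the boundary region is.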
Plot a histogram of activations to get a sense of how they behave differently when feeding adversarial examples vs. normal samples (see the sketch below).
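A minimal sketch of the activation-histogram comparison, assuming a PyTorch model, a chosen `layer` module, and clean/adversarial batches `x_clean` and `x_adv` (placeholder names, not from the original experiments):

```python
import torch
import matplotlib.pyplot as plt

def collect_activations(model, layer, x):
    """Run x through the model and capture the flattened activations of one layer."""
    captured = []
    handle = layer.register_forward_hook(
        lambda module, inputs, output: captured.append(output.detach().flatten())
    )
    with torch.no_grad():
        model(x)
    handle.remove()
    return torch.cat(captured)

def plot_activation_histograms(model, layer, x_clean, x_adv, bins=100):
    """Overlay histograms of one layer's activations for clean vs. adversarial inputs."""
    clean_acts = collect_activations(model, layer, x_clean).cpu().numpy()
    adv_acts = collect_activations(model, layer, x_adv).cpu().numpy()
    plt.hist(clean_acts, bins=bins, alpha=0.5, label="clean")
    plt.hist(adv_acts, bins=bins, alpha=0.5, label="adversarial")
    plt.xlabel("activation value")
    plt.legend()
    plt.show()
```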