NAB Entry Points
NAB allows users to test their detection algorithms in three ways: (1) creating a detector, (2) giving anomaly scores, and (3) giving detections. These methods are detailed below, and their places in the NAB pipeline are illustrated in this diagram. We also provide an [example of option 2](Twitter Anomaly Detector), which is how we ran the Twitter AnomalyDetection algorithms in NAB. Please also see [Reporting results with NAB](Reporting results with NAB).
Option 1: Create a detector (e.g. named "alpha") by subclassing `AnomalyDetector` in nab/detectors/base.py, with the code stored in nab/detectors/alpha/. We did this with the Etsy Skyline algorithm. Add an import conditional to run.py for your detector. With your detector added, simply run `python run.py -d alpha`.
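As a rough sketch of what such a subclass might look like: the method names `initialize` and `handleRecord` follow NAB's detector interface, but check nab/detectors/base.py for the exact signatures. The stub base class and the "AlphaDetector" heuristic below are invented for illustration so the example is self-contained.

```python
# Minimal stand-in for nab.detectors.base.AnomalyDetector so this sketch
# runs on its own; in the real repo you would import and subclass the
# actual base class from nab/detectors/base.py.
class AnomalyDetector:
    def initialize(self):
        pass

    def handleRecord(self, inputData):
        raise NotImplementedError


class AlphaDetector(AnomalyDetector):
    """Hypothetical detector that flags large jumps between consecutive values."""

    def initialize(self):
        self.previousValue = None

    def handleRecord(self, inputData):
        # NAB detectors return a list whose first element is the anomaly
        # score in [0, 1] for this record.
        value = inputData["value"]
        if self.previousValue is None:
            score = 0.0
        else:
            # Crude heuristic: normalize the jump size into [0, 1].
            jump = abs(value - self.previousValue)
            score = min(1.0, jump / (abs(self.previousValue) + 1e-9))
        self.previousValue = value
        return [score]
```

In the real repo the subclass would live under nab/detectors/alpha/, so that `python run.py -d alpha` can pick it up once the import conditional is in place.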
Option 2: Give NAB anomaly scores before the threshold optimization phase. Do this by running your anomaly detection algorithm (e.g. "alpha") on the NAB dataset, generating results files in the NAB format: a CSV file for each data file, with columns for timestamp, value, anomaly_score, and label (please see Appendix C of the whitepaper for additional formatting specs). We include a script to create the necessary directories and the entry in the config/thresholds.json file: `python scripts/create_new_detector.py --detector alpha`. Add your results files to the NAB results directory, which should then look like this:
```
nab/results
  null/
    ...
  alpha/
    artificialNoAnomaly/
    ...
    realTweets/
      alpha_Twitter_volume_AAPL.csv
      ...
```
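To illustrate the results-file layout described above, here is a short sketch that writes a CSV with the four required columns. The row values are made up for the example, and the target path in the comment mirrors the directory listing above; consult Appendix C of the whitepaper for the authoritative format.

```python
import csv
import io

# One row per record: timestamp, raw value, and the algorithm's anomaly
# score in [0, 1]. The label column holds the ground-truth anomaly label
# (example values here are invented).
rows = [
    ("2015-02-26 21:42:53", 104.0, 0.02, 0),
    ("2015-02-26 21:47:53", 972.0, 0.91, 1),
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["timestamp", "value", "anomaly_score", "label"])
writer.writerows(rows)

# In practice you would write this to the matching path under results/,
# e.g. results/alpha/realTweets/alpha_Twitter_volume_AAPL.csv
print(buffer.getvalue())
```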
Now running `python run.py -d alpha --optimize --score --normalize` will optimize the anomaly score threshold for your algorithm's detections, run the scoring algorithm, and normalize the raw scores to yield your final NAB scores.
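Conceptually, the optimization step searches for the anomaly-score threshold that maximizes the score for each application profile. The toy version below uses a simple stand-in scorer (+1 per true detection, -1 per false positive) rather than NAB's actual windowed scoring function, purely to show the shape of the search.

```python
def toy_score(scores, labels, threshold):
    """Stand-in scorer: +1 for each detected anomaly, -1 for each false positive.

    NAB's real scorer is window-based and weighted per application profile;
    this is only a simplified illustration of threshold optimization.
    """
    total = 0
    for score, label in zip(scores, labels):
        if score >= threshold:
            total += 1 if label == 1 else -1
    return total


def optimize_threshold(scores, labels, candidates):
    # Grid search over candidate thresholds, keeping the best-scoring one.
    return max(candidates, key=lambda t: toy_score(scores, labels, t))


scores = [0.1, 0.2, 0.9, 0.3, 0.8]
labels = [0, 0, 1, 0, 1]
best = optimize_threshold(scores, labels, [0.0, 0.25, 0.5, 0.75])
print(best)
```

NAB performs this kind of search once per application profile, which is why config/thresholds.json stores a separate threshold for each profile.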
Option 3: Give NAB the anomaly detections. This is a slight variation on Option 2: you now manually enter threshold values in config/thresholds.json for your detector. This is straightforward: simply set the threshold values for the application profiles to a value less than the anomaly scores your algorithm outputs when it detects an anomaly. For example, your results files can have binary anomaly scores for each record, and the thresholds can all be 0.5; we did this with the Twitter ADVec detector.
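For instance, a thresholds entry for a binary-output detector could assign 0.5 to every application profile. The fragment below is a sketch: "alpha" is our hypothetical detector, and the three profile names are NAB's standard application profiles, but check an existing entry in config/thresholds.json for the exact fields the file expects.

```python
import json

# Hypothetical config/thresholds.json fragment for detector "alpha",
# whose results files contain only 0/1 anomaly scores. Any threshold
# strictly between 0 and 1 turns every 1 into a detection and every 0
# into a non-detection.
thresholds = {
    "alpha": {
        "standard": {"threshold": 0.5},
        "reward_low_FP_rate": {"threshold": 0.5},
        "reward_low_FN_rate": {"threshold": 0.5},
    }
}

print(json.dumps(thresholds, indent=2))
```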
If, after running NAB with your detector, you are interested in having your results posted to the scoreboard, please email [email protected] or submit a pull request. For us to consider adding your results, we must be able to fully reproduce them.