
rb-aioutliers Main package


Main package to install redborder AI outliers on Rocky Linux 9.

Platforms

  • Rocky Linux 9

Running the example

This example runs the outlier detection on a mock dataset. It is recommended to use pipenv or a similar tool to avoid overwriting dependencies.

git clone [email protected]:redBorder/rb-aioutliers.git
cd rb-aioutliers
pip install -r resources/src/requirements.txt
bash resources/src/example/run_example.sh

Installation

  1. yum install epel-release && rpm -ivh http://repo.redborder.com/redborder-repo-0.0.3-1.el7.rb.noarch.rpm

  2. yum install rb-aioutliers

Model design

Initially, data is extracted from a designated Druid datasource in timeseries format, with configurable metrics and settings. After rescaling from zero to one and segmentation, an autoencoder reconstructs the data, enabling anomaly detection through k-sigma thresholding. The anomalies are output in JSON format together with the data reconstructed by the autoencoder.
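The detection step can be sketched in NumPy. This is a minimal sketch under stated assumptions: the moving-average "reconstruction" stands in for the autoencoder, and k = 3 is an illustrative choice, not necessarily the project's default.

```python
import numpy as np

def k_sigma_anomalies(actual, reconstructed, k=3.0):
    """Flag indices whose reconstruction error exceeds mean + k * std."""
    errors = np.abs(actual - reconstructed)
    threshold = errors.mean() + k * errors.std()
    return np.flatnonzero(errors > threshold)

# Toy series with one injected outlier, rescaled from zero to one.
rng = np.random.default_rng(0)
series = rng.random(100)
series[42] = 5.0  # inject an outlier
scaled = (series - series.min()) / (series.max() - series.min())

# Stand-in for the autoencoder: a 5-point moving average "reconstruction".
reconstruction = np.convolve(scaled, np.ones(5) / 5, mode="same")

print(k_sigma_anomalies(scaled, reconstruction))
```

The injected point reconstructs poorly under smoothing, so its error clears the k-sigma threshold while ordinary noise does not.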

Model execution

rb-aioutliers utilizes the Flask framework to create an HTTP server. Users can send Druid queries via POST requests to the /calculate endpoint. When rb-aioutliers receives the Druid query, it sends a request to the Druid broker, retrieves the necessary data, and then proceeds to execute the anomaly detection model.


After executing the model, the server can respond with one of two status messages:

HTTP OK 200 Success: In this case, the response body will be structured as follows:

{
    "status": "success",
    "anomalies": [Array_of_anomalies],
    "prediction": [Array_of_predictions]
}

HTTP 500 Internal Server Error: If there is an issue during the process, the response will contain an error message:

{
    "status": "error",
    "msg": "Error_description_message"
}
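A caller can branch on these two status messages as follows; the helper name is hypothetical, and the payload shapes follow the examples above.

```python
def handle_calculate_response(payload: dict):
    """Return (anomalies, prediction) on success, raise on error."""
    if payload.get("status") == "success":
        return payload["anomalies"], payload["prediction"]
    # HTTP 500 path: surface the server's error description
    raise RuntimeError(payload.get("msg", "unknown error"))

ok = {"status": "success", "anomalies": [], "prediction": []}
print(handle_calculate_response(ok))
```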

Model training

The rb-aioutliers service generates a custom Druid query and sends it to the Redborder cluster Druid broker. After sending the query, it retrieves the data and attempts to train a model with custom parameters such as epochs or batch size. Once the model training is complete, it outputs a backup file and the generated Keras model.


Clusterization with Chef, Consul, AWS S3 & Redis

The rb-aioutliers service can operate in cluster mode! This is achieved by dividing the service into two components: the executor and the trainer.

Executor

The executor service is registered in HashiCorp Consul. When the Redborder WebUI sends a request to api/v1/outliers, it will be directed to the rb-aioutliers.service, which, in turn, will redirect the request to any of the nodes running the rb-aioutliers REST server. This server will download the relevant model from S3, based on the sensor specified in the Druid query, and execute it.


Trainer

For the training service, the Chef client will create individual configuration files for each node, generated from the cookbook and its templates. These configuration files specify which sensor each node should train for. The rb-aioutliers-train service will then download the appropriate model from S3 for each node (see Trainer Jobs below). Once training is complete, the rb-aioutliers-train service uploads the resulting trained model to S3. It's important to note that each model is unique and specific to its corresponding sensor.


Trainer Jobs

The trainer nodes send job data to a Redis server (this is done by the Trainer service), and each node also polls the job queue. The RQ worker processes these jobs in the background. If a node goes down, its work is not lost: the sensor information for training is stored in Redis, so another node can take over its tasks.
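The failover property can be illustrated with a minimal in-memory sketch. In the real service the queue lives in Redis and jobs are processed by RQ workers; the node and job names below are purely illustrative.

```python
from collections import deque

# Shared job queue; in production this is a Redis list consumed by RQ workers.
jobs = deque(["train:FlowSensor", "train:DnsSensor", "train:ApSensor"])

def run_worker(node_name, queue, fail_after=None):
    """Process jobs until the queue is empty or the node 'crashes'."""
    processed = []
    while queue:
        if fail_after is not None and len(processed) >= fail_after:
            return processed  # node goes down; pending jobs stay in the shared queue
        processed.append((node_name, queue.popleft()))
    return processed

done_a = run_worker("node-a", jobs, fail_after=1)  # node-a dies after one job
done_b = run_worker("node-b", jobs)                # node-b takes over the rest
print(done_a, done_b)
```

Because pending jobs live in the shared queue rather than on any single node, a surviving worker drains whatever the failed node left behind.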


For more information about deploying with Chef Server, take a look at the Outliers Cookbook.

Docker support

If you want to run the app inside a Docker container, run the following commands:

cd ./resources/src
docker-compose up --build -d

Now, if you list your Docker containers, you will see the following container running:

Container ID | Image | Command | Created | Status | Exposed Ports
cb18a72ab60e | src_rb_aioutliers_rest | python main.py | 3 minutes ago | Up 3 minutes | 0.0.0.0:39091->39091/tcp, :::39091->39091/tcp

API Endpoints

/calculate

  • HTTP Method: POST
  • Description: Initiates anomaly detection (model execution) with a Druid query.
  • Request Body: form data containing the Druid query as a base64-encoded string.

Example Request:

POST /calculate (application/x-www-form-urlencoded)
query=base64_string

Example Druid Query:

{
  "dataSource": "rb_flow",
  "granularity": {
    "type": "period",
    "period": "pt1m",
    "origin": "2023-09-21T09:00:00Z"
  },
  "intervals": [
    "2023-09-21T09:00:00+00:00/2023-09-21T10:00:00+00:00"
  ],
  "filter": {
    "type": "selector",
    "dimension": "sensor_name",
    "value": "FlowSensor"
  },
  "queryType": "timeseries",
  "context": {
    "timeout": 90000,
    "skipEmptyBuckets": "true"
  },
  "limitSpec": {
    "type": "default",
    "limit": 100,
    "columns": []
  },
  "aggregations": [
    {
      "type": "longSum",
      "name": "bytes",
      "fieldName": "sum_bytes"
    }
  ],
  "postAggregations": []
}
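To send such a query to /calculate, it must be base64-encoded and posted as the query form field. A sketch, with the query abbreviated and the host/port assumed from the Docker example in this README:

```python
import base64
import json

druid_query = {
    "dataSource": "rb_flow",
    "queryType": "timeseries",
    "granularity": {"type": "period", "period": "pt1m",
                    "origin": "2023-09-21T09:00:00Z"},
    "intervals": ["2023-09-21T09:00:00+00:00/2023-09-21T10:00:00+00:00"],
    # remaining fields as in the full example above
}

# Encode the JSON query as the base64 string the endpoint expects.
encoded = base64.b64encode(json.dumps(druid_query).encode()).decode()
form_body = {"query": encoded}
# e.g. requests.post("http://localhost:39091/calculate", data=form_body)
print(encoded[:32])
```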

Example Response:

{
  "anomalies": [
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 45825798.264862545,
      "timestamp": "2023-09-28T07:01:00.000Z"
    }
  ],
  "status": "success"
}

Contributing

  1. Fork the repository on GitHub
  2. Create a named feature branch (like add_component_x)
  3. Write your change
  4. Write tests for your change (if applicable)
  5. Run the tests, ensuring they all pass
  6. Submit a Pull Request using GitHub

License and Authors

LICENSE: AFFERO GENERAL PUBLIC LICENSE, Version 3, 19 November 2007