
rb-aioutliers Main package


Main package to install redborder AI outliers on Rocky Linux 9.

Platforms

  • Rocky Linux 9

Running the example

This example runs the outlier detection on a mock dataset. It is recommended to use pipenv or a similar tool to avoid overwriting dependencies.

git clone [email protected]:redBorder/rb-aioutliers.git
cd rb-aioutliers
pip install -r resources/src/requirements.txt
bash resources/src/example/run_example.sh

Installation

  1. yum install epel-release && rpm -ivh http://repo.redborder.com/redborder-repo-0.0.3-1.el7.rb.noarch.rpm

  2. yum install rb-aioutliers

Model design

Initially, data is extracted from a designated Druid datasource in timeseries format, with configurable metrics and settings. After rescaling from zero to one and segmentation, an autoencoder reconstructs the data, enabling anomaly detection through k-sigma thresholding. The anomalies are output in JSON format together with the data reconstructed by the autoencoder.
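The detection step can be sketched in NumPy. This is a minimal sketch under stated assumptions: the moving-average "reconstruction" stands in for the autoencoder, and k = 3 is an illustrative choice, not necessarily the project's default.

```python
import numpy as np

def k_sigma_anomalies(actual, reconstructed, k=3.0):
    """Flag indices whose reconstruction error exceeds mean + k * std."""
    errors = np.abs(actual - reconstructed)
    threshold = errors.mean() + k * errors.std()
    return np.flatnonzero(errors > threshold)

# Toy series with one injected outlier, rescaled from zero to one.
rng = np.random.default_rng(0)
series = rng.random(100)
series[42] = 5.0  # inject an outlier
scaled = (series - series.min()) / (series.max() - series.min())

# Stand-in for the autoencoder: a 5-point moving average "reconstruction".
reconstruction = np.convolve(scaled, np.ones(5) / 5, mode="same")

print(k_sigma_anomalies(scaled, reconstruction))
```

The injected point reconstructs poorly under smoothing, so its error clears the k-sigma threshold while ordinary noise does not.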

Model execution

rb-aioutliers utilizes the Flask framework to create an HTTP server. Users can send Druid queries via POST requests to the /calculate endpoint. When rb-aioutliers receives the Druid query, it sends a request to the Druid broker, retrieves the necessary data, and then proceeds to execute the anomaly detection model.


After executing the model, the server can respond with one of two status messages:

HTTP OK 200 Success: In this case, the response body will be structured as follows:

{
    "status": "success",
    "anomalies": [Array_of_anomalies],
    "prediction": [Array_of_predictions]
}

HTTP 500 Internal Server Error: If there is an issue during the process, the response will contain an error message:

{
    "status": "error",
    "msg": "Error_description_message"
}
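A caller can branch on these two status messages as follows; the helper name is hypothetical, and the payload shapes follow the examples above.

```python
def handle_calculate_response(payload: dict):
    """Return (anomalies, prediction) on success, raise on error."""
    if payload.get("status") == "success":
        return payload["anomalies"], payload["prediction"]
    # HTTP 500 path: surface the server's error description
    raise RuntimeError(payload.get("msg", "unknown error"))

ok = {"status": "success", "anomalies": [], "prediction": []}
print(handle_calculate_response(ok))
```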

Model training

The rb-aioutliers service generates a custom Druid query and sends it to the Redborder cluster Druid broker. After sending the query, it retrieves the data and attempts to train a model with custom parameters such as epochs or batch size. Once the model training is complete, it outputs a backup file and the generated Keras model.


Clusterization with Chef, Consul, AWS S3 & Redis

The rb-aioutliers service can operate in cluster mode! This is achieved by dividing the service into two components: the executor and the trainer.

Executor

The executor service is registered in HashiCorp Consul. When the Redborder WebUI sends a request to api/v1/outliers, it will be directed to the rb-aioutliers.service, which, in turn, will redirect the request to any of the nodes running the rb-aioutliers REST server. This server will download the relevant model from S3, based on the sensor specified in the Druid query, and execute it.


Trainer

For the training service, the Chef client will create individual configuration files for each node, generated from the cookbook and its templates. These configuration files specify which sensor each node should train for. The rb-aioutliers-train service will then download the appropriate model from S3 for each node (see Trainer Jobs below). Once training is complete, the rb-aioutliers-train service uploads the resulting trained model to S3. It's important to note that each model is unique and specific to its corresponding sensor.


Trainer Jobs

The trainer nodes send job data to a Redis server (this is done by the Trainer service), and each node also polls the job queue. The RQ worker processes these jobs in the background. If a node goes down, its work is not lost: the sensor information for training is stored in Redis, so another node can take over its tasks.
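The failover property can be illustrated with a minimal in-memory sketch. In the real service the queue lives in Redis and jobs are processed by RQ workers; the node and job names below are purely illustrative.

```python
from collections import deque

# Shared job queue; in production this is a Redis list consumed by RQ workers.
jobs = deque(["train:FlowSensor", "train:DnsSensor", "train:ApSensor"])

def run_worker(node_name, queue, fail_after=None):
    """Process jobs until the queue is empty or the node 'crashes'."""
    processed = []
    while queue:
        if fail_after is not None and len(processed) >= fail_after:
            return processed  # node goes down; pending jobs stay in the shared queue
        processed.append((node_name, queue.popleft()))
    return processed

done_a = run_worker("node-a", jobs, fail_after=1)  # node-a dies after one job
done_b = run_worker("node-b", jobs)                # node-b takes over the rest
print(done_a, done_b)
```

Because pending jobs live in the shared queue rather than on any single node, a surviving worker drains whatever the failed node left behind.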


For more information about deploying with Chef Server, take a look at the Outliers Cookbook.

Docker support

If you want to run the app inside a Docker container, run the following commands:

cd ./resources/src
docker-compose up --build -d

Now, if you list your Docker containers, you will see the following container running:

Container ID | Image | Command | Created | Status | Exposed Ports
cb18a72ab60e | src_rb_aioutliers_rest | python main.py | 3 minutes ago | Up 3 minutes | 0.0.0.0:39091->39091/tcp, :::39091->39091/tcp

API Endpoints

/calculate

  • HTTP Method: POST
  • Description: Initiates anomaly detection (model execution) with a Druid query.
  • Request Body: form data containing the Druid query as a base64-encoded string.

Example Request:

POST /calculate (application/x-www-form-urlencoded)
query=base64_string

Example Druid Query:

{
  "dataSource": "rb_flow",
  "granularity": {
    "type": "period",
    "period": "pt1m",
    "origin": "2023-09-21T09:00:00Z"
  },
  "intervals": [
    "2023-09-21T09:00:00+00:00/2023-09-21T10:00:00+00:00"
  ],
  "filter": {
    "type": "selector",
    "dimension": "sensor_name",
    "value": "FlowSensor"
  },
  "queryType": "timeseries",
  "context": {
    "timeout": 90000,
    "skipEmptyBuckets": "true"
  },
  "limitSpec": {
    "type": "default",
    "limit": 100,
    "columns": []
  },
  "aggregations": [
    {
      "type": "longSum",
      "name": "bytes",
      "fieldName": "sum_bytes"
    }
  ],
  "postAggregations": []
}
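To send such a query to /calculate, it must be base64-encoded and posted as the query form field. A sketch, with the query abbreviated and the host/port assumed from the Docker example in this README:

```python
import base64
import json

druid_query = {
    "dataSource": "rb_flow",
    "queryType": "timeseries",
    "granularity": {"type": "period", "period": "pt1m",
                    "origin": "2023-09-21T09:00:00Z"},
    "intervals": ["2023-09-21T09:00:00+00:00/2023-09-21T10:00:00+00:00"],
    # remaining fields as in the full example above
}

# Encode the JSON query as the base64 string the endpoint expects.
encoded = base64.b64encode(json.dumps(druid_query).encode()).decode()
form_body = {"query": encoded}
# e.g. requests.post("http://localhost:39091/calculate", data=form_body)
print(encoded[:32])
```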

Example Response:

{
  "anomalies": [
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 36453984.6858499,
      "timestamp": "2023-09-28T07:00:00.000Z"
    },
    {
      "expected": 45825798.264862545,
      "timestamp": "2023-09-28T07:01:00.000Z"
    }
  ],
  "status": "success"
}

Contributing

  1. Fork the repository on GitHub
  2. Create a named feature branch (like add_component_x)
  3. Write your change
  4. Write tests for your change (if applicable)
  5. Run the tests, ensuring they all pass
  6. Submit a Pull Request using GitHub

License and Authors

LICENSE: AFFERO GENERAL PUBLIC LICENSE, Version 3, 19 November 2007