triton-inference-server/redis_cache
Triton Redis Cache

This repo contains an example cache for caching data with Redis.

Ask questions or report problems on the main Triton issues page.

Build the Cache

If you don't already have it installed, install rapidjson-dev:

apt install rapidjson-dev

Using a recent version of cmake, build and install by running the following:

$ mkdir build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
$ make install

The following required Triton repositories will be pulled and used in the build. By default, the "main" branch/tag will be used for each repo, but the following CMake arguments can be used to override that:

  • triton-inference-server/core: -D TRITON_CORE_REPO_TAG=[tag]
  • triton-inference-server/common: -D TRITON_COMMON_REPO_TAG=[tag]
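For example, to pin both repositories to a release branch (r23.06 here is only an illustrative tag, not a recommendation), the cmake step above becomes:

```shell
# Pin the Triton core and common repos to a specific release tag
# instead of the default "main" branch.
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_CORE_REPO_TAG=r23.06 \
      -DTRITON_COMMON_REPO_TAG=r23.06 ..
```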

Using the Cache

Deploying to Triton

To deploy the Redis cache to Triton, build the binary (see the build instructions above) and copy the libtritoncache_redis.so file into a folder named redis under the cache directory on the server running Triton. By default this directory is /opt/tritonserver/caches, but it can be changed with the --cache-dir CLI option as needed.
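Assuming the default cache directory and the build layout produced by the steps above (the exact lib/ subpath under build/install can vary by platform), the deployment amounts to:

```shell
# Create the "redis" folder under Triton's cache directory and copy
# the built shared library into it. Paths are assumptions - adjust
# them to your build output and your --cache-dir setting.
mkdir -p /opt/tritonserver/caches/redis
cp build/install/lib/libtritoncache_redis.so /opt/tritonserver/caches/redis/
```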

It is also required that Redis be running on a system reachable by Triton. There are many ways to deploy Redis; to learn how to get started, see Redis's getting started guide.

Configuration

The cache is configured using the --cache-config CLI option. The --cache-config option is variadic, meaning it can be repeated to set multiple configuration fields. The format of each option is <cache_name>,<key>=<value>. At a minimum you must provide a host and port so the client can connect to Redis. For example, to connect to a Redis instance running on the host redis-host and listening on port 6379:

tritonserver --cache-config redis,host=redis-host --cache-config redis,port=6379

Available Configuration Options

| Configuration Option | Required | Description | Default |
| -------------------- | -------- | ----------- | ------- |
| host | Yes | The hostname or IP address of the server where Redis is running. | N/A |
| port | Yes | The port number to connect to on the server. | N/A |
| user | No | The username to use for ACL authentication with the Redis server. | default |
| password | No | The password to use for authentication with the Redis server. | N/A |
| db | No | The db number to use. NOTE: use of the db number is considered an anti-pattern in Redis, so it is advised that you not use this option. | 0 |
| connect_timeout | No | The maximum time, in milliseconds, to wait for a connection to Redis to be established. 0 means wait forever. | 0 |
| socket_timeout | No | The maximum time, in milliseconds, the client will wait for a response from Redis. 0 means wait forever. | 0 |
| pool_size | No | The number of pooled connections to Redis the client will maintain. | 1 |
| wait_timeout | No | The maximum time, in milliseconds, to wait for a connection from the pool. | 1000 |
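As a sketch of the variadic flag format described above, the small helper below (hypothetical, not part of Triton or this repo) turns a plain dict of settings into the repeated --cache-config arguments:

```python
# Hypothetical helper: build the repeated --cache-config flags that
# tritonserver expects from a dict of Redis cache settings.
def redis_cache_flags(settings):
    flags = []
    for key, value in settings.items():
        # Each field becomes its own "--cache-config redis,<key>=<value>" pair.
        flags += ["--cache-config", f"redis,{key}={value}"]
    return flags

flags = redis_cache_flags({"host": "redis-host", "port": 6379, "pool_size": 4})
print(" ".join(flags))
# --cache-config redis,host=redis-host --cache-config redis,port=6379 --cache-config redis,pool_size=4
```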

Optional Environment Variables for Credentials

Optionally, you may configure the user and password via environment variables: TRITONCACHE_REDIS_USERNAME for the username and TRITONCACHE_REDIS_PASSWORD for the password.
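As an illustration of how such credentials might be resolved (the precedence shown is an assumption for the sketch, not Triton's documented behavior), consider:

```python
import os

def resolve_credentials(config):
    # Assumed precedence for illustration: explicit --cache-config values
    # win, then the documented environment variables, then the defaults
    # from the configuration table ("default" user, no password).
    user = (config.get("user")
            or os.environ.get("TRITONCACHE_REDIS_USERNAME")
            or "default")
    password = (config.get("password")
                or os.environ.get("TRITONCACHE_REDIS_PASSWORD"))
    return user, password

os.environ.pop("TRITONCACHE_REDIS_USERNAME", None)
os.environ["TRITONCACHE_REDIS_PASSWORD"] = "s3cret"  # e.g. injected by the deployment
print(resolve_credentials({}))  # ('default', 's3cret')
```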

TLS

Transport Layer Security (TLS) can be enabled both in Redis and in the Triton Redis cache. To do so you will need a TLS-enabled build of Redis, e.g. OSS Redis or Redis Enterprise. You will also need to configure Triton Server to use TLS with Redis through the following --cache-config TLS options.

Configuration Items for TLS

| Configuration Option | Required | Description |
| -------------------- | -------- | ----------- |
| tls_enabled | Yes | Set to true to enable TLS. |
| cert | No | The certificate to use for TLS. |
| key | No | The certificate key to use for TLS. |
| cacert | No | The Certificate Authority certificate to use for TLS. |
| sni | No | Server name indication for TLS. |
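Putting the TLS options together with the connection options, a server launch might look like the following (the certificate paths are placeholders, not files this repo provides):

```shell
# Launch Triton with the Redis cache over TLS. The host and the
# /path/to/* certificate files are illustrative placeholders.
tritonserver \
  --cache-config redis,host=redis-host \
  --cache-config redis,port=6379 \
  --cache-config redis,tls_enabled=true \
  --cache-config redis,cert=/path/to/client.crt \
  --cache-config redis,key=/path/to/client.key \
  --cache-config redis,cacert=/path/to/ca.crt
```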

Monitoring and Observability

There are many ways to go about monitoring what's going on in Redis. One popular approach is to export metrics from Redis to Prometheus and use Grafana to visualize them.
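One way to wire that up is with the widely used oliver006/redis_exporter image (an assumption for this sketch; it is a third-party tool, not something this repo ships), which exposes Redis metrics for Prometheus to scrape:

```shell
# Run a Redis metrics exporter on port 9121; point Prometheus at it.
# "redis-host" is the illustrative hostname used earlier in this README.
docker run -d --name redis-exporter -p 9121:9121 \
  oliver006/redis_exporter --redis.addr=redis://redis-host:6379
```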

Example

You can try out the Redis Cache with Triton in docker:

  • clone this repo: git clone https://github.com/triton-inference-server/redis_cache
  • follow the build instructions above
  • clone the Triton server repo: git clone https://github.com/triton-inference-server/server
  • Add the following to docs/examples/model_repository/densenet_onnx/config.pbtxt:

response_cache {
  enable: true
}
  • log in to NGC with docker login nvcr.io:

Username: $oauthtoken
Password: <MY API KEY>

NOTE: Username: $oauthtoken in this context means that your username is literally $oauthtoken - your API key serves as the unique part of your credentials.

  • run docker-compose build
  • run docker-compose up
  • In a separate terminal run docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.06-py3-sdk
  • Run /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
    • on the first run - this will miss the cache
    • subsequent runs will pull the inference out of the cache
    • you can validate this by watching Redis with docker exec -it redis_cache_triton-redis_1 redis-cli monitor