A Retrieval Augmented Generation (RAG) example on Azure. It uses Azure OpenAI Service, Azure Cognitive Search, embeddings, and a sample CSV file to provide grounding for applications that deliver customized generative AI experiences.
Use the requirements.txt file to install all dependencies:
python -m venv .venv
./.venv/bin/pip install -r requirements.txt
Find the Azure OpenAI keys in the Azure OpenAI Service. Note that the keys aren't in the studio, but in the resource itself. Add them to a local .env file. This repository ignores the .env file to prevent you (and me) from adding these keys by mistake.
Your .env file should look like this:
# Azure OpenAI
OPENAI_API_TYPE="azure"
OPENAI_API_BASE="https://demo-alfredo-openai.openai.azure.com/"
OPENAI_API_KEY="0asd8924yl87asljhsd823lkjahsdf234"
OPENAI_API_VERSION="2023-07-01-preview"
# Azure Cognitive Search
SEARCH_SERVICE_NAME="https://demo-alfredo.search.windows.net"
SEARCH_API_KEY="zlkjhasd876lkjh234978sg098srtiuy"
SEARCH_INDEX_NAME="demo-index"
Note that Azure Cognitive Search is only needed if you are following the Retrieval Augmented Generation (RAG) demo. It isn't required for a simple chat application.
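As a rough illustration of how an application could check these settings before starting, here is a minimal standard-library sketch. The function names `parse_env_file` and `missing_settings`, and the split into OpenAI and search groups, are assumptions made for this example. The repository may well load the file differently (for example with a library such as python-dotenv).

```python
# Variable names come from the .env example above.
# The grouping into two tuples is an assumption for this sketch.
REQUIRED_OPENAI = (
    "OPENAI_API_TYPE",
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
)
REQUIRED_SEARCH = (
    "SEARCH_SERVICE_NAME",
    "SEARCH_API_KEY",
    "SEARCH_INDEX_NAME",
)


def parse_env_file(text):
    """Parse KEY="value" lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and the optional double quotes.
        settings[key.strip()] = value.strip().strip('"')
    return settings


def missing_settings(settings, use_search=False):
    """Return the names of required variables that are absent or empty.

    The search variables are only checked when use_search is True,
    which mirrors the note that they are only needed for the RAG demo.
    """
    required = REQUIRED_OPENAI + (REQUIRED_SEARCH if use_search else ())
    return [name for name in required if not settings.get(name)]
```

A caller could read the file with `open(".env").read()`, pass the text to `parse_env_file`, and stop with a clear message if `missing_settings` returns any names.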
Add the access token as a GitHub Actions secret. Create a token with enough permissions to write to packages. Azure needs it to authenticate against the GitHub Container Registry and pull the image.
You'll need the following:
- An Azure subscription ID (find it here or follow this guide)
- A Service Principal with the following details: the AppID, password, and tenant information. Create one with:
az ad sp create-for-rbac -n "REST API Service Principal"
Then assign the IAM role for the subscription. Alternatively, set the proper role access with the following command (replace $AZURE_SUBSCRIPTION_ID with your real subscription ID):
az ad sp create-for-rbac --name "CICD" --role contributor --scopes /subscriptions/$AZURE_SUBSCRIPTION_ID --sdk-auth
Make sure you have one Azure Container Apps instance already created, then capture its name and resource group. Both are used in the workflow file.
Make sure you use 2 CPU cores and 4 GB of memory per container. Otherwise, you may get an error, because loading HuggingFace with FastAPI requires significant memory upfront.
There are a few things that might get you into a failed state when deploying:
- Not having enough RAM per container
- Not using authentication for accessing the remote registry (ghcr.io in this case). Authentication is always required
- Not using a GITHUB_TOKEN or not setting the write permissions for "packages". Go to settings/actions and make sure that "Read and write permissions" is set in the "Workflow permissions" section
- Using a port other than 80 in the container. By default, Azure Container Apps uses port 80. Update it to match the container
If you run into trouble, check the logs in the portal or use the following Azure CLI command:
az containerapp logs show --name $CONTAINER_APP_NAME --resource-group $RESOURCE_GROUP_NAME --follow
Update both variables ($CONTAINER_APP_NAME and $RESOURCE_GROUP_NAME) to match your environment.
Although there are a few best practices specific to the FastAPI framework, many other recommendations for building solid HTTP APIs apply anywhere.
The HTTP specification defines several error codes. Use the code that matches the condition that caused the error. For example, the 401 HTTP code can be used when access is unauthorized. Don't use a single error code as a catch-all.
Here are some common scenarios associated with HTTP error codes:
- 400 Bad Request: Use this to indicate a schema problem. For example, if the server expected a string but got an integer
- 401 Unauthorized: When authentication is required and it wasn't present or satisfied
- 404 Not Found: When the resource doesn't exist
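The scenarios above can be sketched as a small decision function. This is only an illustration using the standard library's `http.HTTPStatus`. The function `check_request`, its parameters, and the "fetch item by name" request are hypothetical, and the order of the checks is a design choice rather than a rule. In a FastAPI application the same codes would typically be passed as the `status_code` of an `HTTPException`.

```python
from http import HTTPStatus


def check_request(payload, token, items):
    """Return the status for a hypothetical 'fetch item by name' request."""
    # 401: authentication is required and it wasn't present.
    if not token:
        return HTTPStatus.UNAUTHORIZED
    # 400: schema problem, the server expected a string for "name".
    if not isinstance(payload.get("name"), str):
        return HTTPStatus.BAD_REQUEST
    # 404: the request is well formed but the resource doesn't exist.
    if payload["name"] not in items:
        return HTTPStatus.NOT_FOUND
    return HTTPStatus.OK
```

Each failure condition gets its own code, so a client can tell a malformed request apart from a missing credential or a missing resource.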
Note that it is good practice to use 404 Not Found to protect against requests that try to find out whether a resource exists without being authenticated. A good example is a service that doesn't want to expose usernames unless you are authenticated.
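A minimal sketch of that username case, assuming a hypothetical `user_lookup_status` function and an in-memory set of known users. The point it illustrates is that an unauthenticated caller receives the same 404 whether or not the username exists.

```python
from http import HTTPStatus


def user_lookup_status(username, authenticated, known_users):
    """Return the status for a hypothetical username lookup."""
    # Unauthenticated callers always get 404, whether or not the user
    # exists, so the response can't be used to enumerate usernames.
    if not authenticated:
        return HTTPStatus.NOT_FOUND
    # Authenticated callers get an honest answer.
    if username not in known_users:
        return HTTPStatus.NOT_FOUND
    return HTTPStatus.OK
```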
GET | POST | PUT | HEAD |
---|---|---|---|
Read Only | Write Only | Update existing | Does it exist? |