Skip to content

Commit

Permalink
docs: update guides and API refs (#38)
Browse files Browse the repository at this point in the history
  • Loading branch information
micpst authored May 27, 2024
1 parent 2fb275f commit bc912c1
Show file tree
Hide file tree
Showing 32 changed files with 300 additions and 92 deletions.
118 changes: 118 additions & 0 deletions docs/how-to/llms/custom.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,118 @@
# How-To: Create Custom LLM

LLM is one of the main components of the db-ally ecosystem. It handles all interactions with the selected Large Language Model. It is used for operations like view selection, IQL generation and natural language response generation, therefore it is essential to be able to integrate with any LLM API you may encounter.

## Implementing a Custom LLM

The [`LLM`](../../reference/llms/index.md#dbally.llms.base.LLM) class is an abstract base class that provides a framework for interacting with a Large Language Model. To create a custom LLM, you need to create a subclass of [`LLM`](../../reference/llms/index.md#dbally.llms.base.LLM) and implement the required methods and properties.

Here's a step-by-step guide:

### Step 1: Define the subclass

First, define your subclass and specify the type of options it will use.

```python
from dbally.llms.base import LLM
from dbally.llms.litellm import LiteLLMOptions

class MyLLM(LLM[LiteLLMOptions]):
_options_cls = LiteLLMOptions
```

In this example we will be using [`LiteLLMOptions`](../../reference/llms/litellm.md#dbally.llms.clients.litellm.LiteLLMOptions), which contain all options supported by most popular LLM APIs. If you need a different interface, see [Customising LLM Options](#customising-llm-options) to learn how to implement it.

### Step 2: Create the custom LLM client

The [`client`](../../reference/llms/index.md#dbally.llms.base.LLM.client) property is an abstract method that must be implemented in your subclass. This property should return an instance of [`LLMClient`](../../reference/llms/index.md#dbally.llms.clients.base.LLMClient) that your LLM will use to interact with the model.

```python
class MyLLM(LLM[LiteLLMOptions]):
_options_cls = LiteLLMOptions

@cached_property
def client(self) -> MyLLMClient:
return MyLLMClient()
```

`MyLLMClient` should be a class that implements the [`LLMClient`](../../reference/llms/index.md#dbally.llms.clients.base.LLMClient) interface.

```python
from dbally.llms.clients.base import LLMClient

class MyLLMClient(LLMClient[LiteLLMOptions]):

async def call(
self,
prompt: ChatFormat,
response_format: Optional[Dict[str, str]],
options: LiteLLMOptions,
event: LLMEvent,
) -> str:
# Your LLM API call
```

The [`call`](../../reference/llms/index.md#dbally.llms.clients.base.LLMClient.call) method is an abstract method that must be implemented in your subclass. This method should call the LLM inference API and return the response.

### Step 3: Use tokenizer to count tokens

The [`count_tokens`](../../reference/llms/index.md#dbally.llms.base.LLM.count_tokens) method is used to count the number of tokens in the messages. You can override this method in your custom class to use the tokenizer and count tokens specifically for your model.

```python
class MyLLM(LLM[LiteLLMOptions]):

def count_tokens(self, messages: ChatFormat, fmt: Dict[str, str]) -> int:
# Count tokens in the messages in a custom way
```
!!!warning
Incorrect token counting can cause problems in the [`NLResponder`](../../reference/nl_responder.md#dbally.nl_responder.nl_responder.NLResponder) and force the use of an explanation prompt template that is more generic and does not include specific rows from the IQL response.

### Step 4: Define custom prompt formatting

The [`format_prompt`](../../reference/llms/index.md#dbally.llms.base.LLM.format_prompt) method is used to apply formatting to the prompt template. You can override this method in your custom class to change how the formatting is performed.

```python
class MyLLM(LLM[LiteLLMOptions]):

def format_prompt(self, template: PromptTemplate, fmt: Dict[str, str]) -> ChatFormat:
# Apply custom formatting to the prompt template
```
!!!note
In general, implementation of this method is not required unless the LLM API does not support [OpenAI conversation formatting](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages){:target="_blank"}. If your model API expects a different format, override this method to avoid issues with inference call.

## Customising LLM Options

[`LLMOptions`](../../reference/llms/index.md#dbally.llms.clients.base.LLMOptions) is a class that defines the options your LLM will use. To create a custom options, you need to create a subclass of [`LLMOptions`](../../reference/llms/index.md#dbally.llms.clients.base.LLMOptions) and define the required properties that will be passed to the [`LLMClient`](../../reference/llms/index.md#dbally.llms.clients.base.LLMClient).

```python
from dbally.llms.base import LLMOptions

@dataclass
class MyLLMOptions(LLMOptions):
temperature: float
max_tokens: int = 4096
```

Each property should be annotated with its type. You can also provide default values if necessary. Don't forget to update the custom LLM class signatures.

```python
class MyLLM(LLM[MyLLMOptions]):
_options_cls = MyLLMOptions

class MyLLMClient(LLMClient[MyLLMOptions]):
...
```

## Using the Custom LLM

Once your subclass is defined, you can instantiate and use it with your collection like this.

```python
import dbally

llm = MyLLM("my_model", MyLLMOptions(temperature=0.5))
my_collection = dbally.create_collection("my_collection", llm)
response = await my_collection.ask("Which LLM should I use?")
```

Now your custom model powers the db-ally engine for querying structured data. Have fun!
98 changes: 98 additions & 0 deletions docs/how-to/llms/litellm.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
# How-To: Use LiteLLM models

db-ally comes with ready-to-use LLM implementation called [`LiteLLM`](../../reference/llms/litellm.md#dbally.llms.litellm.LiteLLM) that uses the litellm package under the hood, providing access to all major LLM APIs such as OpenAI, Anthropic, VertexAI, Hugging Face and more.

## Basic Usage

Install litellm extension.

```bash
pip install dbally[litellm]
```

Integrate db-ally with your LLM vendor.

=== "OpenAI"

```python
import os
from dbally.llms.litellm import LiteLLM

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-api-key"

llm=LiteLLM(model_name="gpt-4o")
```

=== "Anthropic"

```python
import os
from dbally.llms.litellm import LiteLLM

## set ENV variables
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

llm=LiteLLM(model_name="claude-3-opus-20240229")
```

=== "Anyscale"

```python
import os
from dbally.llms.litellm import LiteLLM

## set ENV variables
os.environ["ANYSCALE_API_KEY"] = "your-api-key"

llm=LiteLLM(model_name="anyscale/meta-llama/Llama-2-70b-chat-hf")
```

Use LLM in your collection.

```python
my_collection = dbally.create_collection("my_collection", llm)
response = await my_collection.ask("Which LLM should I use?")
```

## Advanced Usage

For more advanced users, you may also want to parametrize your LLM using [`LiteLLMOptions`](../../reference/llms/litellm.md#dbally.llms.clients.litellm.LiteLLMOptions). Here is the list of availabe parameters:

- `frequency_penalty`: *number or null (optional)* - It is used to penalize new tokens based on their frequency in the text so far.

- `max_tokens`: *integer (optional)* - The maximum number of tokens to generate in the chat completion.

- `n`: *integer or null (optional)* - The number of chat completion choices to generate for each input message.

- `presence_penalty`: *number or null (optional)* - It is used to penalize new tokens based on their existence in the text so far.

- `seed`: *integer or null (optional)* - This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.

- `stop`: *string/ array/ null (optional)* - Up to 4 sequences where the API will stop generating further tokens.

- `temperature`: *number or null (optional)* - The sampling temperature to be used, between 0 and 2. Higher values like 0.8 produce more random outputs, while lower values like 0.2 make outputs more focused and deterministic.

- `top_p`: *number or null (optional)* - An alternative to sampling with temperature. It instructs the model to consider the results of the tokens with top_p probability. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered.

```python
import dbally

llm = MyLLM("my_model", LiteLLMOptions(temperature=0.5))
my_collection = dbally.create_collection("my_collection", llm)
```

You can also override any default parameter on [`ask`](../../reference/collection.md#dbally.Collection.ask) call.

```python
response = await my_collection.ask(
question="Which LLM should I use?",
llm_options=LiteLLMOptions(
temperature=0.65,
max_tokens=1024,
),
)
```

!!!warning
Some parameters are not compatible with some models and may cause exceptions, check [LiteLLM documentation](https://docs.litellm.ai/docs/completion/input#translated-openai-params){:target="_blank"} for supported options.
2 changes: 1 addition & 1 deletion docs/quickstart/index.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Quickstart Guide 1
# Quickstart: Intro

This guide will help you get started with db-ally. We will use a simple example to demonstrate how to use db-ally to query a database using an AI model. We will use OpenAI's GPT to generate SQL queries based on natural language questions and SqlAlchemy to interact with the database.

Expand Down
4 changes: 2 additions & 2 deletions docs/quickstart/quickstart2.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Quickstart Guide 2: Semantic Similarity
# Quickstart: Semantic Similarity

This guide is a continuation of the [Quickstart](./index.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 1 code here: [quickstart_code.py](quickstart_code.py).
This guide is a continuation of the [Intro](./index.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 1 code here: [quickstart_code.py](quickstart_code.py).

This guide will demonstrate how to use semantic similarity to handle queries in which the filter values are similar to those in the database, without requiring an exact match. We will use filtering by country as an example.

Expand Down
4 changes: 2 additions & 2 deletions docs/quickstart/quickstart3.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Quickstart Guide 3: Multiple Views
# Quickstart: Multiple Views

This guide continues from [Quickstart Guide 2](./quickstart2.md). It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 2 code here: [quickstart2_code.py](quickstart2_code.py).
This guide continues from [Semantic Similarity](./quickstart2.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 2 code here: [quickstart2_code.py](quickstart2_code.py).

The guide illustrates how to use multiple views to handle queries requiring different types of data. `CandidateView` and `JobView` are used as examples.

Expand Down
2 changes: 1 addition & 1 deletion docs/reference/embeddings/index.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
# EmbeddingClient

::: dbally.embedding_client.EmbeddingClient
::: dbally.embeddings.EmbeddingClient
3 changes: 3 additions & 0 deletions docs/reference/embeddings/litellm.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# LiteLLMEmbeddingClient

::: dbally.embeddings.LiteLLMEmbeddingClient
5 changes: 0 additions & 5 deletions docs/reference/embeddings/openai.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/reference/event_handlers/langsmith_handler.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# LangmisthEventHandler
# LangSmithEventHandler

::: dbally.audit.LangSmithEventHandler

2 changes: 1 addition & 1 deletion docs/reference/index.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# db-ally root API
# dbally


::: dbally.create_collection
5 changes: 2 additions & 3 deletions docs/reference/iql/iql_generator.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@

::: dbally.iql_generator.iql_generator.IQLGenerator

::: dbally.data_models.prompts.iql_prompt_template.default_iql_template

::: dbally.data_models.prompts.IQLPromptTemplate
::: dbally.iql_generator.iql_prompt_template.IQLPromptTemplate

::: dbally.iql_generator.iql_prompt_template.default_iql_template
15 changes: 0 additions & 15 deletions docs/reference/llm/index.md

This file was deleted.

3 changes: 0 additions & 3 deletions docs/reference/llm/llm_options.md

This file was deleted.

3 changes: 0 additions & 3 deletions docs/reference/llm/openai.md

This file was deleted.

17 changes: 0 additions & 17 deletions docs/reference/llm/prompt_builder.md

This file was deleted.

7 changes: 7 additions & 0 deletions docs/reference/llms/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# LLM

::: dbally.llms.base.LLM

::: dbally.llms.clients.base.LLMClient

::: dbally.llms.clients.base.LLMOptions
7 changes: 7 additions & 0 deletions docs/reference/llms/litellm.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# LiteLLM

::: dbally.llms.litellm.LiteLLM

::: dbally.llms.clients.litellm.LiteLLMClient

::: dbally.llms.clients.litellm.LiteLLMOptions
4 changes: 2 additions & 2 deletions docs/reference/nl_responder.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,6 @@ Otherwise, a response is generated using a `nl_responder_prompt_template`.

::: dbally.nl_responder.nl_responder.NLResponder

::: dbally.data_models.prompts.query_explainer_prompt_template
::: dbally.nl_responder.query_explainer_prompt_template

::: dbally.data_models.prompts.nl_responder_prompt_template.default_nl_responder_template
::: dbally.nl_responder.nl_responder_prompt_template.default_nl_responder_template
2 changes: 1 addition & 1 deletion docs/reference/similarity/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,6 @@ Explore [Similarity Stores](./similarity_store/index.md) and [Similarity Fetcher
* [How-To: Use Similarity Indexes with Data from Custom Sources](../../how-to/use_custom_similarity_fetcher.md)
* [How-To: Store Similarity Index in a Custom Store](../../how-to/use_custom_similarity_store.md)
* [How-To: Update Similarity Indexes](../../how-to/update_similarity_indexes.md)
* [Quickstart Guide 2: Semantic Similarity](../../quickstart/quickstart2.md)
* [Quickstart: Semantic Similarity](../../quickstart/quickstart2.md)

::: dbally.similarity.SimilarityIndex
2 changes: 1 addition & 1 deletion docs/reference/view_selection/llm_view_selector.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,4 @@

::: dbally.view_selection.LLMViewSelector

::: dbally.data_models.prompts.default_view_selector_template
::: dbally.view_selection.view_selector_prompt_template.default_view_selector_template
22 changes: 15 additions & 7 deletions docs/about/roadmap.md → docs/roadmap.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
---
hide:
- navigation
---

# Roadmap

db-ally is actively developed and maintained by a core team at [deepsense.ai](https://deepsense.ai) and a community of contributors.
Expand Down Expand Up @@ -30,13 +35,16 @@ Below you can find a list of planned integrations.
- [ ] HTTP REST Endpoints
- [ ] GraphQL Endpoints

### LLMs
### LLM Providers

- [x] OpenAI
- [x] Anthropic
- [x] VertexAI
- [x] Hugging Face
- [x] Bedrock
- [x] Azure

- [x] OpenAI models
- [ ] Claude 3
- [ ] LLama-2
- [ ] Mistral / Mixtral
- [ ] VertexAI Gemini
And many more, the full list can be found in the [LiteLLM documentation](https://github.com/BerriAI/litellm?tab=readme-ov-file#supported-providers-docs){:target="_blank"}

### Vector stores

Expand All @@ -50,5 +58,5 @@ Below you can find a list of planned integrations.

### Query tracking

- [x] Langsmith
- [x] LangSmith
- [ ] OpenTelemetry
Loading

0 comments on commit bc912c1

Please sign in to comment.