[Docs Agent] Release of Docs Agent v.0.3.4 #408

Merged (12 commits, May 10, 2024)
examples/gemini/python/docs-agent/README.md (121 changes: 50 additions & 71 deletions)

## Overview

Docs Agent apps use a technique known as Retrieval Augmented Generation (RAG), which
allows you to bring your own documents as knowledge sources to AI language models.
This approach helps the AI language models to generate relevant and accurate responses
that are grounded in the information that you provide and control.

![Docs Agent architecture](docs/images/docs-agent-architecture-01.png)


The following list summarizes the tasks and features supported by Docs Agent:

- **Process Markdown**: Split Markdown files into small plain text chunks. (See
[Docs Agent chunking process][chunking-process].)
- **Generate embeddings**: Use an embedding model to process text chunks into embeddings
and store them in a vector database.
- **Perform semantic search**: Compare embeddings in a vector database to retrieve
chunks that are most relevant to user questions.
- **Add context to a user question**: Add chunks returned from a semantic search as
[context][prompt-structure] to a prompt.
- **Fact-check responses**: This [experimental feature][fact-check-section] composes
a follow-up prompt and asks the language model to “fact-check” its own previous response.
- **Generate related questions**: In addition to answering a question, Docs Agent can
[suggest related questions][related-questions-section] based on the context of the
question.
- **Return URLs of source documents**: URLs are stored as chunks' metadata. This enables
Docs Agent to return the URLs of the source documents.
- **Collect feedback from users**: Docs Agent's web app has buttons that allow users
to [like responses][like-generated-responses] or [submit rewrites][submit-a-rewrite].
- **Convert Google Docs, PDF, and Gmail into Markdown files**: This feature uses
[Apps Script][apps-script-readme] to convert Google Docs, PDF, and Gmail into
Markdown files, which can then be used as input datasets for Docs Agent.
- **Run benchmark tests**: Docs Agent can [run benchmark tests][benchmark-test] to measure
and compare the quality of text chunks, embeddings, and AI-generated responses.
- **Use the Semantic Retrieval API and AQA model**: Docs Agent can use Gemini's
[Semantic Retrieval API][semantic-api] to upload source documents to online corpora
and use the [AQA model][aqa-model] for answering questions.
- **Manage online corpora using the Docs Agent CLI**: The [Docs Agent CLI][cli-reference]
lets you create, update, and delete online corpora using the Semantic Retrieval API.
- **Prevent duplicate chunks and delete obsolete chunks in databases**: Docs Agent
uses [metadata in chunks][chunking-process] to avoid uploading duplicate chunks
and to delete obsolete chunks that are no longer present in the source.
- **Run the Docs Agent CLI from anywhere in a terminal**:
[Set up the Docs Agent CLI][cli-readme] to make requests to the Gemini models
from anywhere in a terminal.
- **Support the Gemini 1.5 models**: Docs Agent works with the Gemini 1.5 models,
`gemini-1.5-pro-latest` and `text-embedding-004`. The new ["1.5"][new-15-mode] web app
mode uses all three Gemini models to their strengths: AQA (`aqa`), Gemini 1.0 Pro
(`gemini-pro`), and Gemini 1.5 Pro (`gemini-1.5-pro-latest`). For example, the
following `config.yaml` settings enable this mode:

  ```
  models:
    - language_model: "models/aqa"
      embedding_model: "models/text-embedding-004"
      api_endpoint: "generativelanguage.googleapis.com"
  ...
  app_mode: "1.5"
  db_type: "chroma"
  ```

For more information on Docs Agent's architecture and features,
see the [Docs Agent concepts][docs-agent-concepts] page.
Update your host machine's environment to prepare for the Docs Agent setup:

1. Update the Linux package repositories on the host machine:

```
sudo apt update
```

2. Install the following dependencies:

```
sudo apt install git pipx python3-venv
```

3. Install `poetry`:

```
pipx install poetry
```

4. To add `$HOME/.local/bin` to your `PATH` variable, run the following
command:

```
pipx ensurepath
```


6. Update your environment:

```
source ~/.bashrc
```

Clone the Docs Agent project and install dependencies:

1. Clone the following repo:

```
git clone https://github.com/google/generative-ai-docs.git
```

2. Go to the Docs Agent project directory:

```
cd generative-ai-docs/examples/gemini/python/docs-agent
```

3. Install dependencies using `poetry`:

```
poetry install
```

4. Enter the `poetry` shell environment:

```
poetry shell
```

Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
[oauth-client]: https://ai.google.dev/docs/oauth_quickstart#set-cloud
[cli-readme]: docs_agent/interfaces/README.md
[cli-reference]: docs/cli-reference.md
[chunking-process]: docs/chunking-process.md
[new-15-mode]: docs/config-reference.md#app_mode
examples/gemini/python/docs-agent/docs/chunking-process.md (68 changes: 68 additions & 0 deletions)
# Docs Agent chunking process

This page describes Docs Agent's chunking process and potential optimizations.

Currently, Docs Agent utilizes Markdown headings (`#`, `##`, and `###`) to
split documents into smaller, manageable chunks. However, the Docs Agent team
is actively developing more advanced strategies to improve the quality and
relevance of these chunks for retrieval.

## Chunking technique

In Retrieval Augmented Generation ([RAG][rag]) based systems, ensuring each
chunk contains the right information and context is crucial for accurate
retrieval. The goal of an effective chunking process is to ensure that each
chunk encapsulates a focused topic, which enhances the accuracy of retrieval
and ultimately leads to better answers. At the same time, the Docs Agent team
acknowledges the importance of a flexible approach that allows for
customization based on specific datasets and use cases.

Key characteristics in Docs Agent’s chunking process include:

- **Docs Agent splits documents based on Markdown headings.** However,
this approach has limitations, especially when dealing with large sections.
- **Docs Agent chunks are smaller than 5000 bytes (characters).** This size
limit is set by the embedding model used in generating embeddings.
- **Docs Agent enhances chunks with additional metadata.** The metadata helps
Docs Agent to execute operations efficiently, such as preventing duplicate
chunks in databases and deleting obsolete chunks that are no longer
present in the source.
- **Docs Agent retrieves the top 5 chunks and displays the top chunk's URL.**
However, this is adjustable in Docs Agent’s configuration (see the `widget`
and `experimental` app modes).
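
To make the first three characteristics concrete, the following is a minimal
sketch of a heading-based splitter. It is illustrative only (the function name
and metadata fields are assumptions, not Docs Agent's actual implementation):
it splits a Markdown file at `#`, `##`, and `###` headings, caps each chunk at
5000 bytes, and attaches metadata that can later help detect duplicate or
obsolete chunks.

```python
import re
from pathlib import Path

MAX_CHUNK_SIZE = 5000  # Size limit (in bytes) imposed by the embedding model.


def chunk_markdown(path: str) -> list[dict]:
    """Splits a Markdown file into chunks at #, ##, and ### headings."""
    text = Path(path).read_text(encoding="utf-8")
    # Split immediately before each level 1-3 heading line.
    sections = [s for s in re.split(r"(?m)^(?=#{1,3} )", text) if s.strip()]
    chunks = []
    for index, section in enumerate(sections):
        # Oversized sections are simply truncated in this sketch; a real
        # pipeline would split them further.
        data = section.encode("utf-8")[:MAX_CHUNK_SIZE]
        chunks.append({
            "text": data.decode("utf-8", errors="ignore"),
            # Metadata that helps detect duplicate and obsolete chunks later.
            "metadata": {"origin": path, "section_index": index},
        })
    return chunks
```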

The Docs Agent team continues to explore various optimizations to enhance
the functionality and effectiveness of the chunking process. These efforts
include refining the chunking algorithm itself and developing advanced
post-processing techniques, for instance, reconstructing chunks into their
original documents after retrieval.

Additionally, the team has been exploring methods for co-optimizing content
structure and chunking strategies, with the aim of maximizing retrieval
effectiveness by ensuring that the structure of the source document itself
complements the chunking process.

## Chunks retrieval

Docs Agent employs two distinct approaches for storing and retrieving chunks:

- **The local database approach uses a [Chroma][chroma] vector database.**
This approach grants greater control over the chunking and retrieval
process. This option is recommended for development and experimental
setups.
- **The online corpus approach uses Gemini’s
[Semantic Retrieval API][semantic-retrieval].** This approach provides
the advantages of centrally hosted online databases, ensuring
accessibility for all users throughout the organization. The drawback is
reduced control, because the API may dictate how chunks are selected and
where customization can be applied.

Choosing between these approaches depends on the specific needs of each
deployment: it is a trade-off between control and transparency on one side
and possible improvements in performance, broader reach, and ease of use on
the other.
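
As a minimal sketch of the local database approach, the snippet below queries
a Chroma collection for the top 5 chunks and reads the source URL stored as
metadata next to each embedding. The database path, collection name, query
text, and `url` metadata key are illustrative assumptions, not Docs Agent's
actual configuration.

```python
import chromadb

# Hypothetical on-disk location and collection name for the vector database.
client = chromadb.PersistentClient(path="vector_stores/chroma")
collection = client.get_collection(name="docs_collection")

# Retrieve the 5 chunks that are most relevant to the question.
results = collection.query(
    query_texts=["How does Docs Agent chunk Markdown files?"],
    n_results=5,
)

# The URL of each source document is stored as chunk metadata.
top_chunk = results["documents"][0][0]
top_url = results["metadatas"][0][0].get("url")
print(top_url)
print(top_chunk)
```

The online corpus approach replaces this local query with a call to the
Semantic Retrieval API, which performs the retrieval server-side.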

<!-- Reference links -->

[rag]: concepts.md
[chroma]: https://docs.trychroma.com/
[semantic-retrieval]: https://ai.google.dev/gemini-api/docs/semantic_retrieval
examples/gemini/python/docs-agent/docs/cli-reference.md (63 changes: 57 additions & 6 deletions)
This page provides a list of the Docs Agent commands, their usage, and examples.

The Docs Agent CLI helps developers manage the Docs Agent project and
interact with language models. It can handle various tasks such as
processing documents, populating vector databases, launching the chatbot,
running benchmark tests, sending prompts to language models, and more.

**Important**: All `agent` commands need to run in the `poetry shell`
environment.

## Processing documents

### Chunk Markdown files into small text chunks

The command below deletes development databases specified in the
`config.yaml` file:

```sh
agent cleanup-dev
```

### Write logs to a CSV file

The command below writes the summaries of all captured debugging information
(in the `logs/debugs` directory) to a `.csv` file:

```sh
agent write-logs-to-csv
```

## Launching the chatbot web app

### Launch the Docs Agent web app

a log view page (which is accessible at `<APP_URL>/logs`):

```sh
agent chatbot --enable_show_logs
```

## Running a benchmark test

### Run the Docs Agent benchmark test

absolute or relative path, for example:

```sh
agent helpme write comments for this C++ file? --file ../my-project/test.cc
```

### Ask for advice in a session

The command below starts a new session (`--new`), which tracks responses,
before running the `agent helpme` command:

```sh
agent helpme <REQUEST> --file <PATH_TO_FILE> --new
```

For example:

```sh
agent helpme write a draft of all features found in this README file? --file ./README.md --new
```

After starting a session, use the `--cont` flag to include the previous
responses as context to the request:

```sh
agent helpme <REQUEST> --cont
```

For example:

```sh
agent helpme write a concept doc that delves into more details of these features? --cont
```

### Ask for advice using RAG

The command below uses a local or online vector database (specified in
the `config.yaml` file) to retrieve relevant context for the request:

```sh
agent helpme <REQUEST> --file <PATH_TO_FILE> --rag
```

## Managing online corpora

### List all existing online corpora
