[Docs Agent] Release of Docs Agent v.0.3.4 #408

Merged (12 commits, May 10, 2024)
examples/gemini/python/docs-agent/README.md (121 changes: 50 additions & 71 deletions)

## Overview

Docs Agent apps use a technique known as Retrieval Augmented Generation (RAG), which
allows you to bring your own documents as knowledge sources to AI language models.
This approach helps the AI language models to generate relevant and accurate responses
that are grounded in the information that you provide and control.

![Docs Agent architecture](docs/images/docs-agent-architecture-01.png)


The following list summarizes the tasks and features supported by Docs Agent:

- **Process Markdown**: Split Markdown files into small plain text chunks. (See
[Docs Agent chunking process][chunking-process].)
- **Generate embeddings**: Use an embedding model to process text chunks into embeddings
and store them in a vector database.
- **Perform semantic search**: Compare embeddings in a vector database to retrieve
chunks that are most relevant to user questions.
- **Add context to a user question**: Add chunks returned from a semantic search as
[context][prompt-structure] to a prompt.
- **Fact-check responses**: This [experimental feature][fact-check-section] composes
a follow-up prompt and asks the language model to “fact-check” its own previous response.
- **Generate related questions**: In addition to answering a question, Docs Agent can
[suggest related questions][related-questions-section] based on the context of the
question.
- **Return URLs of source documents**: URLs are stored as chunks' metadata. This enables
Docs Agent to return the URLs of the source documents.
- **Collect feedback from users**: Docs Agent's web app has buttons that allow users
to [like responses][like-generated-responses] or [submit rewrites][submit-a-rewrite].
- **Convert Google Docs, PDF, and Gmail into Markdown files**: This feature uses
[Apps Script][apps-script-readme] to convert Google Docs, PDF, and Gmail into
Markdown files, which can then be used as input datasets for Docs Agent.
- **Run benchmark tests**: Docs Agent can [run benchmark tests][benchmark-test] to measure
and compare the quality of text chunks, embeddings, and AI-generated responses.
- **Use the Semantic Retrieval API and AQA model**: Docs Agent can use Gemini's
[Semantic Retrieval API][semantic-api] to upload source documents to online corpora
and use the [AQA model][aqa-model] for answering questions.
- **Manage online corpora using the Docs Agent CLI**: The [Docs Agent CLI][cli-reference]
lets you create, update, and delete online corpora using the Semantic Retrieval API.
- **Prevent duplicate chunks and delete obsolete chunks in databases**: Docs Agent
uses [metadata in chunks][chunking-process] to avoid uploading duplicate chunks
and to delete obsolete chunks that are no longer present in the source.
- **Run the Docs Agent CLI from anywhere in a terminal**:
[Set up the Docs Agent CLI][cli-readme] to make requests to the Gemini models
from anywhere in a terminal.
- **Support the Gemini 1.5 models**: Docs Agent works with the Gemini 1.5 models,
`gemini-1.5-pro-latest` and `text-embedding-004`. The new ["1.5"][new-15-mode] web app
mode uses all three Gemini models to their strengths: AQA (`aqa`), Gemini 1.0 Pro
(`gemini-pro`), and Gemini 1.5 Pro (`gemini-1.5-pro-latest`). For example, the
following `config.yaml` settings enable this mode:

  ```
  models:
    - language_model: "models/aqa"
      embedding_model: "models/text-embedding-004"
      api_endpoint: "generativelanguage.googleapis.com"
  ...
  app_mode: "1.5"
  db_type: "chroma"
  ```

For more information on Docs Agent's architecture and features,
see the [Docs Agent concepts][docs-agent-concepts] page.
Update your host machine's environment to prepare for the Docs Agent setup:

1. Update the Linux package repositories on the host machine:

```
sudo apt update
```

2. Install the following dependencies:

```
sudo apt install git pipx python3-venv
```

3. Install `poetry`:

```
pipx install poetry
```

4. To add `$HOME/.local/bin` to your `PATH` variable, run the following
command:

```
pipx ensurepath
```


6. Update your environment:

```
source ~/.bashrc
```

Clone the Docs Agent project and install dependencies:

1. Clone the following repo:

```
git clone https://github.com/google/generative-ai-docs.git
```

2. Go to the Docs Agent project directory:

```
cd generative-ai-docs/examples/gemini/python/docs-agent
```

3. Install dependencies using `poetry`:

```
poetry install
```

4. Enter the `poetry` shell environment:

```
poetry shell
```

Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
[oauth-client]: https://ai.google.dev/docs/oauth_quickstart#set-cloud
[cli-readme]: docs_agent/interfaces/README.md
[cli-reference]: docs/cli-reference.md
[chunking-process]: docs/chunking-process.md
[new-15-mode]: docs/config-reference.md#app_mode
examples/gemini/python/docs-agent/docs/chunking-process.md (68 changes: 68 additions & 0 deletions)
# Docs Agent chunking process

This page describes Docs Agent's chunking process and potential optimizations.

Currently, Docs Agent utilizes Markdown headings (`#`, `##`, and `###`) to
split documents into smaller, manageable chunks. However, the Docs Agent team
is actively developing more advanced strategies to improve the quality and
relevance of these chunks for retrieval.

## Chunking technique

In Retrieval Augmented Generation ([RAG][rag]) based systems, ensuring each
chunk contains the right information and context is crucial for accurate
retrieval. The goal of an effective chunking process is to ensure that each
chunk encapsulates a focused topic, which enhances the accuracy of retrieval
and ultimately leads to better answers. At the same time, the Docs Agent team
acknowledges the importance of a flexible approach that allows for
customization based on specific datasets and use cases.

Key characteristics in Docs Agent’s chunking process include:

- **Docs Agent splits documents based on Markdown headings.** However,
this approach has limitations, especially when dealing with large sections.
- **Docs Agent chunks are smaller than 5000 bytes (characters).** This size
limit is set by the embedding model used in generating embeddings.
- **Docs Agent enhances chunks with additional metadata.** The metadata helps
Docs Agent to execute operations efficiently, such as preventing duplicate
chunks in databases and deleting obsolete chunks that are no longer
present in the source.
- **Docs Agent retrieves the top 5 chunks and displays the top chunk's URL.**
However, this is adjustable in Docs Agent’s configuration (see the `widget`
and `experimental` app modes).
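
To make the first three characteristics concrete, the following is a minimal
sketch of a heading-based splitter. It is illustrative only (the function name
and metadata fields are assumptions, not Docs Agent's actual implementation):
it splits a Markdown file at `#`, `##`, and `###` headings, caps each chunk at
5000 bytes, and attaches metadata that can later help detect duplicate or
obsolete chunks.

```python
import re
from pathlib import Path

MAX_CHUNK_SIZE = 5000  # Size limit (in bytes) imposed by the embedding model.


def chunk_markdown(path: str) -> list[dict]:
    """Splits a Markdown file into chunks at #, ##, and ### headings."""
    text = Path(path).read_text(encoding="utf-8")
    # Split immediately before each level 1-3 heading line.
    sections = [s for s in re.split(r"(?m)^(?=#{1,3} )", text) if s.strip()]
    chunks = []
    for index, section in enumerate(sections):
        # Oversized sections are simply truncated in this sketch; a real
        # pipeline would split them further.
        data = section.encode("utf-8")[:MAX_CHUNK_SIZE]
        chunks.append({
            "text": data.decode("utf-8", errors="ignore"),
            # Metadata that helps detect duplicate and obsolete chunks later.
            "metadata": {"origin": path, "section_index": index},
        })
    return chunks
```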

The Docs Agent team continues to explore various optimizations to enhance
the functionality and effectiveness of the chunking process. These efforts
include refining the chunking algorithm itself and developing advanced
post-processing techniques, for instance, reconstructing chunks into their
original documents after retrieval.

Additionally, the team has been exploring methods for co-optimizing content
structure and chunking strategies, with the aim of maximizing retrieval
effectiveness by ensuring that the structure of the source document itself
complements the chunking process.

## Chunks retrieval

Docs Agent employs two distinct approaches for storing and retrieving chunks:

- **The local database approach uses a [Chroma][chroma] vector database.**
This approach grants greater control over the chunking and retrieval
process. This option is recommended for development and experimental
setups.
- **The online corpus approach uses Gemini’s
[Semantic Retrieval API][semantic-retrieval].** This approach provides
the advantages of centrally hosted online databases, ensuring
accessibility for all users throughout the organization. The drawback is
reduced control, because the API may dictate how chunks are selected and
where customization can be applied.

Choosing between these approaches depends on the specific needs of each
deployment: it is a trade-off between control and transparency on one side
and possible improvements in performance, broader reach, and ease of use on
the other.
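
As a minimal sketch of the local database approach, the snippet below queries
a Chroma collection for the top 5 chunks and reads the source URL stored as
metadata next to each embedding. The database path, collection name, query
text, and `url` metadata key are illustrative assumptions, not Docs Agent's
actual configuration.

```python
import chromadb

# Hypothetical on-disk location and collection name for the vector database.
client = chromadb.PersistentClient(path="vector_stores/chroma")
collection = client.get_collection(name="docs_collection")

# Retrieve the 5 chunks that are most relevant to the question.
results = collection.query(
    query_texts=["How does Docs Agent chunk Markdown files?"],
    n_results=5,
)

# The URL of each source document is stored as chunk metadata.
top_chunk = results["documents"][0][0]
top_url = results["metadatas"][0][0].get("url")
print(top_url)
print(top_chunk)
```

The online corpus approach replaces this local query with a call to the
Semantic Retrieval API, which performs the retrieval server-side.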

<!-- Reference links -->

[rag]: concepts.md
[chroma]: https://docs.trychroma.com/
[semantic-retrieval]: https://ai.google.dev/gemini-api/docs/semantic_retrieval
examples/gemini/python/docs-agent/docs/cli-reference.md (63 changes: 57 additions & 6 deletions)
This page provides a list of the Docs Agent commands, their usage, and examples.

The Docs Agent CLI helps developers manage the Docs Agent project and
interact with language models. It can handle various tasks such as
processing documents, populating vector databases, launching the chatbot,
running benchmark tests, sending prompts to language models, and more.

**Important**: All `agent` commands need to run in the `poetry shell`
environment.

## Processing documents

### Chunk Markdown files into small text chunks

The command below deletes development databases specified in the
`config.yaml` file:

```sh
agent cleanup-dev
```

### Write logs to a CSV file

The command below writes the summaries of all captured debugging information
(in the `logs/debugs` directory) to a `.csv` file:

```sh
agent write-logs-to-csv
```

## Launching the chatbot web app

### Launch the Docs Agent web app

a log view page (which is accessible at `<APP_URL>/logs`):

```sh
agent chatbot --enable_show_logs
```

## Running a benchmark test

### Run the Docs Agent benchmark test

absolute or relative path, for example:

```sh
agent helpme write comments for this C++ file? --file ../my-project/test.cc
```

### Ask for advice in a session

The command below starts a new session (`--new`), which tracks responses,
before running the `agent helpme` command:

```sh
agent helpme <REQUEST> --file <PATH_TO_FILE> --new
```

For example:

```sh
agent helpme write a draft of all features found in this README file? --file ./README.md --new
```

After starting a session, use the `--cont` flag to include the previous
responses as context to the request:

```sh
agent helpme <REQUEST> --cont
```

For example:

```sh
agent helpme write a concept doc that delves into more details of these features? --cont
```

### Ask for advice using RAG

The command below uses a local or online vector database (specified in
the `config.yaml` file) to retrieve relevant context for the request:

```sh
agent helpme <REQUEST> --file <PATH_TO_FILE> --rag
```

## Managing online corpora

### List all existing online corpora
