DOCS: Concept Section Improvements & Updates (#27733)
Edited mainly the `Concepts` section in the LangChain documentation.

Overview:
* Clarified some explanations and added missing words in several documents.
* Rephrased some sentences to make them shorter and more concise.

---------

Co-authored-by: Eugene Yurtsev <[email protected]>
3 people authored Nov 13, 2024
1 parent 02de346 commit da7c79b
Showing 10 changed files with 31 additions and 30 deletions.
2 changes: 1 addition & 1 deletion docs/docs/concepts/chat_history.mdx
@@ -17,7 +17,7 @@ Most conversations start with a **system message** that sets the context for the

The **assistant** may respond directly to the user or if configured with tools request that a [tool](/docs/concepts/tool_calling) be invoked to perform a specific task.

So a full conversation often involves a combination of two patterns of alternating messages:
A full conversation often involves a combination of two patterns of alternating messages:

1. The **user** and the **assistant** representing a back-and-forth conversation.
2. The **assistant** and **tool messages** representing an ["agentic" workflow](/docs/concepts/agents) where the assistant is invoking tools to perform specific tasks.
10 changes: 5 additions & 5 deletions docs/docs/concepts/chat_models.mdx
@@ -2,7 +2,7 @@

## Overview

Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific tuning for every scenario.
Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific fine-tuning for every scenario.

Modern LLMs are typically accessed through a chat model interface that takes a list of [messages](/docs/concepts/messages) as input and returns a [message](/docs/concepts/messages) as output.

@@ -85,7 +85,7 @@ Many chat models have standardized parameters that can be used to configure the
| Parameter | Description |
|----------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `model` | The name or identifier of the specific AI model you want to use (e.g., `"gpt-3.5-turbo"` or `"gpt-4"`). |
| `temperature` | Controls the randomness of the model's output. A higher value (e.g., 1.0) makes responses more creative, while a lower value (e.g., 0.1) makes them more deterministic and focused. |
| `temperature` | Controls the randomness of the model's output. A higher value (e.g., 1.0) makes responses more creative, while a lower value (e.g., 0.0) makes them more deterministic and focused. |
| `timeout` | The maximum time (in seconds) to wait for a response from the model before canceling the request. Ensures the request doesn’t hang indefinitely. |
| `max_tokens` | Limits the total number of tokens (words and punctuation) in the response. This controls how long the output can be. |
| `stop` | Specifies stop sequences that indicate when the model should stop generating tokens. For example, you might use specific strings to signal the end of a response. |
@@ -97,9 +97,9 @@ Many chat models have standardized parameters that can be used to configure the
Some important things to note:

- Standard parameters only apply to model providers that expose parameters with the intended functionality. For example, some providers do not expose a configuration for maximum output tokens, so max_tokens can't be supported on these.
- Standard params are currently only enforced on integrations that have their own integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.), they're not enforced on models in ``langchain-community``.
- Standard parameters are currently only enforced on integrations that have their own integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.); they're not enforced on models in `langchain-community`.

ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel head to the [API reference](https://python.langchain.com/api_reference/) for that model.
Chat models also accept other parameters that are specific to that integration. To find all the parameters supported by a chat model, head to the respective [API reference](https://python.langchain.com/api_reference/) for that model.
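
To make the `temperature` row in the table above more concrete, here is a minimal sketch of the textbook temperature-scaled softmax. It is an illustration only: the function name `softmax_with_temperature` and the example logits are hypothetical, and providers may implement sampling (and the `0.0` case in particular) differently.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy decoding, i.e. all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
probs_hot = softmax_with_temperature(example_logits, 1.0)
probs_cold = softmax_with_temperature(example_logits, 0.1)
probs_greedy = softmax_with_temperature(example_logits, 0.0)
```

Lower temperatures concentrate probability on the highest-scoring token, which is why low values read as "more deterministic and focused".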

## Tool calling

@@ -150,7 +150,7 @@ An alternative approach is to use semantic caching, where you cache responses ba

A semantic cache introduces a dependency on another model on the critical path of your application (e.g., the semantic cache may rely on an [embedding model](/docs/concepts/embedding_models) to convert text to a vector representation), and it's not guaranteed to capture the meaning of the input accurately.

However, there might be situations where caching chat model responses is beneficial. For example, if you have a chat model that is used to answer frequently asked questions, caching responses can help reduce the load on the model provider and improve response times.
However, there might be situations where caching chat model responses is beneficial. For example, if you have a chat model that is used to answer frequently asked questions, caching responses can help reduce the load on the model provider, lower costs, and improve response times.

Please see the [how to cache chat model responses](/docs/how_to/chat_model_caching/) guide for more details.
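
As a rough illustration of the frequently-asked-questions case, the sketch below caches responses on an exact match of the input messages. It is not LangChain's caching API: `ExactMatchCache` and `fake_model` are hypothetical names, and the model call is a stand-in for a real provider request.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


class ExactMatchCache:
    """Hypothetical cache keyed on the exact sequence of input messages."""

    def __init__(self, model_call: Callable[[List[Message]], str]) -> None:
        self._model_call = model_call
        self._store: Dict[Tuple[Message, ...], str] = {}
        self.hits = 0
        self.misses = 0

    def invoke(self, messages: List[Message]) -> str:
        key = tuple(messages)
        if key in self._store:
            # Identical input seen before: skip the provider call entirely.
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._model_call(messages)
        self._store[key] = response
        return response


def fake_model(messages: List[Message]) -> str:
    # Stand-in for a slow, billable chat model request.
    return f"Answer to: {messages[-1][1]}"


cached = ExactMatchCache(fake_model)
faq = [("user", "What are your opening hours?")]
first = cached.invoke(faq)   # miss: calls the model
second = cached.invoke(faq)  # hit: served from the cache
```

An exact-match key only helps when the same input recurs verbatim, which is why it suits fixed FAQ-style questions better than free-form conversations.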

2 changes: 1 addition & 1 deletion docs/docs/concepts/document_loaders.mdx
@@ -29,7 +29,7 @@ loader = CSVLoader(
data = loader.load()
```

or if working with large datasets, you can use the `.lazy_load` method:
When working with large datasets, you can use the `.lazy_load` method:

```python
for document in loader.lazy_load():
8 changes: 4 additions & 4 deletions docs/docs/concepts/lcel.mdx
@@ -6,7 +6,7 @@

The **L**ang**C**hain **E**xpression **L**anguage (LCEL) takes a [declarative](https://en.wikipedia.org/wiki/Declarative_programming) approach to building new [Runnables](/docs/concepts/runnables) from existing Runnables.

This means that you describe what you want to happen, rather than how you want it to happen, allowing LangChain to optimize the run-time execution of the chains.
This means that you describe what *should* happen, rather than *how* it should happen, allowing LangChain to optimize the run-time execution of the chains.

We often refer to a `Runnable` created using LCEL as a "chain". It's important to remember that a "chain" is `Runnable` and it implements the full [Runnable Interface](/docs/concepts/runnables).

@@ -20,8 +20,8 @@ We often refer to a `Runnable` created using LCEL as a "chain". It's important t

LangChain optimizes the run-time execution of chains built with LCEL in a number of ways:

- **Optimize parallel execution**: Run Runnables in parallel using [RunnableParallel](#runnableparallel) or run multiple inputs through a given chain in parallel using the [Runnable Batch API](/docs/concepts/runnables/#optimized-parallel-execution-batch). Parallel execution can significantly reduce the latency as processing can be done in parallel instead of sequentially.
- **Guarantee Async support**: Any chain built with LCEL can be run asynchronously using the [Runnable Async API](/docs/concepts/runnables/#asynchronous-support). This can be useful when running chains in a server environment where you want to handle large number of requests concurrently.
- **Optimized parallel execution**: Run Runnables in parallel using [RunnableParallel](#runnableparallel) or run multiple inputs through a given chain in parallel using the [Runnable Batch API](/docs/concepts/runnables/#optimized-parallel-execution-batch). Parallel execution can significantly reduce the latency as processing can be done in parallel instead of sequentially.
- **Guaranteed Async support**: Any chain built with LCEL can be run asynchronously using the [Runnable Async API](/docs/concepts/runnables/#asynchronous-support). This can be useful when running chains in a server environment where you want to handle a large number of requests concurrently.
- **Simplify streaming**: LCEL chains can be streamed, allowing for incremental output as the chain is executed. LangChain can optimize the streaming of the output to minimize the time-to-first-token (time elapsed until the first chunk of output from a [chat model](/docs/concepts/chat_models) or [llm](/docs/concepts/text_llms) comes out).

Other benefits include:
@@ -38,7 +38,7 @@ LCEL is an [orchestration solution](https://en.wikipedia.org/wiki/Orchestration_

While we have seen users run chains with hundreds of steps in production, we generally recommend using LCEL for simpler orchestration tasks. When the application requires complex state management, branching, cycles or multiple agents, we recommend that users take advantage of [LangGraph](/docs/concepts/architecture#langgraph).

In LangGraph, users define graphs that specify the flow of the application. This allows users to keep using LCEL within individual nodes when LCEL is needed, while making it easy to define complex orchestration logic that is more readable and maintainable.
In LangGraph, users define graphs that specify the application's flow. This allows users to keep using LCEL within individual nodes when LCEL is needed, while making it easy to define complex orchestration logic that is more readable and maintainable.

Here are some guidelines:

3 changes: 2 additions & 1 deletion docs/docs/concepts/messages.mdx
@@ -8,7 +8,7 @@

Messages are the unit of communication in [chat models](/docs/concepts/chat_models). They are used to represent the input and output of a chat model, as well as any additional context or metadata that may be associated with a conversation.

Each message has a **role** (e.g., "user", "assistant"), **content** (e.g., text, multimodal data), and additional metadata that can vary depending on the chat model provider.
Each message has a **role** (e.g., "user", "assistant") and **content** (e.g., text, multimodal data) with additional metadata that varies depending on the chat model provider.

LangChain provides a unified message format that can be used across chat models, allowing users to work with different chat models without worrying about the specific details of the message format used by each model provider.

@@ -39,6 +39,7 @@ The content of a message text or a list of dictionaries representing [multimodal
Currently, most chat models support text as the primary content type, with some models also supporting multimodal data. However, support for multimodal data is still limited across most chat model providers.

For more information see:
* [SystemMessage](#systemmessage) -- for content which should be passed to direct the conversation.
* [HumanMessage](#humanmessage) -- for content in the input from the user.
* [AIMessage](#aimessage) -- for content in the response from the model.
* [Multimodality](/docs/concepts/multimodality) -- for more information on multimodal content.
4 changes: 2 additions & 2 deletions docs/docs/concepts/retrieval.mdx
@@ -27,7 +27,7 @@ These systems accommodate various data formats:
- Unstructured text (e.g., documents) is often stored in vector stores or lexical search indexes.
- Structured data is typically housed in relational or graph databases with defined schemas.

Despite this diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces.
Despite the growing diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces.
Models play a crucial role in this process by translating natural language queries into formats compatible with the underlying search index or database.
This translation enables more intuitive and flexible interactions with complex data structures.

@@ -41,7 +41,7 @@ This translation enables more intuitive and flexible interactions with complex d

## Query analysis

While users typically prefer to interact with retrieval systems using natural language, retrieval systems can specific query syntax or benefit from particular keywords.
While users typically prefer to interact with retrieval systems using natural language, these systems may require specific query syntax or benefit from certain keywords.
Query analysis serves as a bridge between raw user input and optimized search queries. Some common applications of query analysis include:

1. **Query Re-writing**: Queries can be re-written or expanded to improve semantic or lexical searches.
8 changes: 4 additions & 4 deletions docs/docs/concepts/runnables.mdx
@@ -1,6 +1,6 @@
# Runnable interface

The Runnable interface is foundational for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs](
The Runnable interface is the foundation for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs](
https://langchain-ai.github.io/langgraph/concepts/low_level/#compiling-your-graph) and more.

This guide covers the main concepts and methods of the Runnable interface, which allows developers to interact with various LangChain components in a consistent and predictable manner.
@@ -42,7 +42,7 @@ Some Runnables may provide their own implementations of `batch` and `batch_as_co
rely on a `batch` API provided by a model provider).

:::note
The async versions of `abatch` and `abatch_as_completed` these rely on asyncio's [gather](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather) and [as_completed](https://docs.python.org/3/library/asyncio-task.html#asyncio.as_completed) functions to run the `ainvoke` method in parallel.
The async versions, `abatch` and `abatch_as_completed`, rely on asyncio's [gather](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather) and [as_completed](https://docs.python.org/3/library/asyncio-task.html#asyncio.as_completed) functions to run the `ainvoke` method in parallel.
:::
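
The difference between the two asyncio functions named in the note can be sketched with the standard library alone. This is not LangChain's implementation: `fake_ainvoke`, `abatch_sketch` and `abatch_as_completed_sketch` are hypothetical stand-ins that only mirror the pattern of awaiting many `ainvoke`-style coroutines concurrently.

```python
import asyncio
from typing import List, Tuple


async def fake_ainvoke(x: int) -> int:
    # Stand-in for a Runnable's `ainvoke`; larger inputs sleep for less time.
    await asyncio.sleep(0.01 * (3 - x))
    return x * 2


async def abatch_sketch(inputs: List[int]) -> List[int]:
    # `gather` runs all calls concurrently and returns results in input order.
    return list(await asyncio.gather(*(fake_ainvoke(x) for x in inputs)))


async def abatch_as_completed_sketch(inputs: List[int]) -> List[Tuple[int, int]]:
    async def run(index: int, x: int) -> Tuple[int, int]:
        # Carry the input index along so results can be matched to inputs.
        return index, await fake_ainvoke(x)

    results = []
    # `as_completed` yields awaitables as each call finishes, whatever the order.
    for finished in asyncio.as_completed([run(i, x) for i, x in enumerate(inputs)]):
        results.append(await finished)
    return results


ordered = asyncio.run(abatch_sketch([1, 2, 3]))
completed = asyncio.run(abatch_as_completed_sketch([1, 2, 3]))
```

Pairing each result with its input index is one way to keep `as_completed`-style output usable, since arrival order depends on how long each call takes.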

:::tip
@@ -58,7 +58,7 @@ Runnables expose an asynchronous API, allowing them to be called using the `awai

Please refer to the [Async Programming with LangChain](/docs/concepts/async) guide for more details.

## Streaming apis
## Streaming APIs
<span data-heading-keywords="streaming-api"></span>

Streaming is critical in making applications based on LLMs feel responsive to end-users.
@@ -101,7 +101,7 @@ This is an advanced feature that is unnecessary for most users. You should proba
skip this section unless you have a specific need to inspect the schema of a Runnable.
:::

In some advanced uses, you may want to programmatically **inspect** the Runnable and determine what input and output types the Runnable expects and produces.
In more advanced use cases, you may want to programmatically **inspect** the Runnable and determine what input and output types the Runnable expects and produces.

The Runnable interface provides methods to get the [JSON Schema](https://json-schema.org/) of the input and output types of a Runnable, as well as [Pydantic schemas](https://docs.pydantic.dev/latest/) for the input and output types.

4 changes: 2 additions & 2 deletions docs/docs/concepts/structured_outputs.mdx
@@ -119,11 +119,11 @@ json_object = json.loads(ai_msg.content)

There are a few challenges when producing structured output with the above methods:

(1) If using tool calling, tool call arguments needs to be parsed from a dictionary back to the original schema.
(1) When tool calling is used, tool call arguments need to be parsed from a dictionary back to the original schema.

(2) In addition, the model needs to be instructed to *always* use the tool when we want to enforce structured output, which is a provider specific setting.

(3) If using JSON mode, the output needs to be parsed into a JSON object.
(3) When JSON mode is used, the output needs to be parsed into a JSON object.

With these challenges in mind, LangChain provides a helper function (`with_structured_output()`) to streamline the process.
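
Challenges (1) and (3) amount to manual parsing steps like the ones sketched below. The `Answer` schema, the `tool_call` dictionary, and the `raw_content` string are all hypothetical examples using only the standard library; they show the kind of boilerplate a helper can hide, not what `with_structured_output()` does internally.

```python
import json
from dataclasses import dataclass


@dataclass
class Answer:
    """Hypothetical output schema used only for this illustration."""

    answer: str
    followup_question: str


# (1) Tool calling: arguments arrive as a plain dictionary and must be
# converted back into the schema by hand.
tool_call = {
    "name": "Answer",
    "args": {"answer": "Paris", "followup_question": "What about Germany?"},
}
parsed_from_tool = Answer(**tool_call["args"])

# (3) JSON mode: the model returns a string that must first be parsed into
# a JSON object before it can be mapped onto the schema.
raw_content = '{"answer": "Paris", "followup_question": "What about Germany?"}'
parsed_from_json = Answer(**json.loads(raw_content))
```

Both paths end in the same typed object, but each needs its own parsing code and error handling.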

4 changes: 2 additions & 2 deletions docs/docs/concepts/tools.mdx
@@ -6,15 +6,15 @@

## Overview

The **tool** abstraction in LangChain associates a python **function** with a **schema** that defines the function's **name**, **description** and **input**.
The **tool** abstraction in LangChain associates a Python **function** with a **schema** that defines the function's **name**, **description** and **expected arguments**.

**Tools** can be passed to [chat models](/docs/concepts/chat_models) that support [tool calling](/docs/concepts/tool_calling) allowing the model to request the execution of a specific function with specific inputs.

## Key concepts

- Tools are a way to encapsulate a function and its schema in a way that can be passed to a chat model.
- Create tools using the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator, which simplifies the process of tool creation, supporting the following:
- Automatically infer the tool's **name**, **description** and **inputs**, while also supporting customization.
- Automatically infer the tool's **name**, **description** and **expected arguments**, while also supporting customization.
- Defining tools that return **artifacts** (e.g. images, dataframes, etc.)
- Hiding input arguments from the schema (and hence from the model) using **injected tool arguments**.
