Commit cb27158

eyurtsev committed Oct 15, 2024
1 parent 74376d1 commit cb27158
Showing 4 changed files with 47 additions and 23 deletions.
15 changes: 11 additions & 4 deletions docs/docs/concepts/chat_models.mdx
@@ -15,7 +15,7 @@ In addition, some chat models offer additional capabilities:

LangChain provides a consistent interface for working with chat models from different providers while offering additional features for monitoring, debugging, and optimizing the performance of applications that use LLMs.

- * Integrations with many chat model providers (e.g., Anthropic, OpenAI, Ollama, Cohere, Hugging Face, Groq, Microsoft Azure, Google Vertex, Amazon Bedrock). Please see [chat model integrations](/docs/integrations/chat_models/) for an up-to-date list of supported models.
+ * Integrations with many chat model providers (e.g., Anthropic, OpenAI, Ollama, Cohere, Hugging Face, Groq, Microsoft Azure, Google Vertex, Amazon Bedrock). Please see [chat model integrations](/docs/integrations/chats/) for an up-to-date list of supported models.
* Use either LangChain's [messages](/docs/concepts/messages) format or OpenAI format.
* Standard [tool calling API](/docs/concepts#tool-calling): standard interface for binding tools to models, accessing tool call requests made by models, and sending tool results back to the model.
* Provides support for [async programming](/docs/concepts/async), [efficient batching](/docs/concepts/runnables#batch), and [a rich streaming API](/docs/concepts/streaming).
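
As a minimal sketch of what this consistent interface looks like in practice (assuming the `langchain-openai` package is installed and an `OPENAI_API_KEY` is set; any other provider's chat model class works the same way):

```python
from langchain_openai import ChatOpenAI

# Every provider-specific chat model class exposes the same core interface.
model = ChatOpenAI(model="gpt-4o-mini")

# Invoke with a list of messages; (role, content) tuples are shorthand
# for LangChain's message objects.
response = model.invoke([("human", "What is LangChain?")])
print(response.content)
```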
@@ -31,7 +31,7 @@ LangChain chat models fall into two categories:

LangChain chat models are named with a convention that prefixes "Chat" to their class names (e.g., `ChatOllama`, `ChatAnthropic`, `ChatOpenAI`, etc.).

- Please review the [chat model integrations](/docs/integrations/chat_models/) for a list of supported models.
+ Please review the [chat model integrations](/docs/integrations/chat/) for a list of supported models.

:::note
Models that do **not** include `Chat` or include "LLM" as a suffix in their name typically refer to older models that do not follow the chat model interface and
@@ -96,6 +96,12 @@ LangChain supports two message formats to interact with chat models:
Chat models can call [tools](/docs/concepts/tools) to perform tasks such as fetching data from a database, making API requests, or running custom code. Please
see the [tool calling](/docs/concepts#tool-calling) guide for more information.
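
A minimal sketch of tool calling, assuming `langchain-openai` is installed and an `OPENAI_API_KEY` is set (the `get_weather` tool is a made-up example):

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"It is sunny in {city}."  # hypothetical stub implementation

model = ChatOpenAI(model="gpt-4o-mini")
model_with_tools = model.bind_tools([get_weather])

# The model replies with a tool call request instead of plain text.
ai_msg = model_with_tools.invoke("What's the weather in Paris?")
print(ai_msg.tool_calls)  # e.g. [{'name': 'get_weather', 'args': {'city': 'Paris'}, ...}]
```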

## Structured Outputs

Chat models can be asked to respond in a particular format (e.g., JSON, or output matching a particular schema). This feature is extremely
useful for information extraction tasks. Please read more about
the technique in the [structured outputs](/docs/concepts#structured_output) guide.
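
For illustration, a sketch of requesting schema-conforming output via `with_structured_output` (assuming `langchain-openai` and `pydantic` are installed; the `Person` schema is a made-up example):

```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class Person(BaseModel):
    """Information extracted about a person."""
    name: str = Field(description="The person's name")
    age: int = Field(description="The person's age")

model = ChatOpenAI(model="gpt-4o-mini")
structured_model = model.with_structured_output(Person)

# Returns a validated Person instance rather than free-form text.
person = structured_model.invoke("Alice is 30 years old.")
print(person.name, person.age)
```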

## Multimodality

Large Language Models (LLMs) are not limited to processing text. They can also be used to process other types of data, such as images, audio, and video. This is known as [multimodality](/docs/concepts/multimodality).
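
As a rough sketch, an image can be passed to a multimodal chat model as a content block inside a message (assuming a vision-capable model; the image URL is a placeholder):

```python
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")  # assumes a vision-capable model

# A single message can mix text and image content blocks.
message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ]
)
response = model.invoke([message])
print(response.content)
```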
@@ -108,7 +114,7 @@ A chat model's context window refers to the maximum size of the input sequence the model can process at a time.

If the input exceeds the context window, the model may not be able to process the entire input and could raise an error. In conversational applications, this is especially important because the context window determines how much information the model can "remember" throughout a conversation. Developers often need to keep the input within the context window to maintain a coherent dialogue without exceeding the limit. For more details on handling memory in conversations, refer to the [memory](/docs/concepts/memory) guide.

- The size of the input is measured in **tokens** which are the unit of processing that the model uses. Read the [tokenization](/docs/concepts#tokenization) guide for more information on tokenization and tokens.
+ The size of the input is measured in [tokens](/docs/concepts/tokens) which are the unit of processing that the model uses.
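
One way to check an input's size before sending it is the `get_num_tokens` helper that LangChain chat models expose (a sketch, assuming `langchain-openai` is installed):

```python
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

# Estimate the token count of an input up front so the prompt
# (plus the expected output) stays within the model's context window.
num_tokens = model.get_num_tokens("LangChain is cool!")
print(num_tokens)
```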

## Advanced Topics

@@ -145,14 +151,15 @@ Please see the [how to cache chat model responses](/docs/how_to/#chat-model-caching) guide for more details.
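
A minimal caching sketch, assuming `langchain-core` and `langchain-openai` are installed (the in-memory cache is just one backend option):

```python
from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache
from langchain_openai import ChatOpenAI

# Cache responses process-wide so repeated identical prompts skip the API call.
set_llm_cache(InMemoryCache())

model = ChatOpenAI(model="gpt-4o-mini")
model.invoke("Tell me a joke")  # first call hits the API
model.invoke("Tell me a joke")  # second identical call is served from the cache
```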
## Related Resources

* How-to guides on using chat models: [how-to guides](/docs/how_to/#chat-models).
- * List of supported chat models: [chat model integrations](/docs/integrations/chat_models/).
+ * List of supported chat models: [chat model integrations](/docs/integrations/chats/).

### Conceptual guides

* [Messages](/docs/concepts/messages)
* [Tool calling](/docs/concepts#tool-calling)
* [Multimodality](/docs/concepts/multimodality)
* [Structured outputs](/docs/concepts#structured_output)
* [Tokens](/docs/concepts/tokens)



20 changes: 1 addition & 19 deletions docs/docs/concepts/index.mdx
@@ -451,25 +451,7 @@ TODO(concepts): Add URL fragment

#### Tokens

Most model providers measure input and output in units called **tokens**.
Tokens are the basic units that language models read and generate when processing or producing text.
The exact definition of a token can vary depending on the specific way the model was trained -
for instance, in English, a token could be a single word like "apple", or a part of a word like "app".

When you send a model a prompt, the words and characters in the prompt are encoded into tokens using a **tokenizer**.
The model then streams back generated output tokens, which the tokenizer decodes into human-readable text.
The below example shows how OpenAI models tokenize `LangChain is cool!`:

![](/img/tokenization.png)

You can see that it gets split into 5 different tokens, and that the boundaries between tokens are not exactly the same as word boundaries.

The reason language models use tokens rather than something more immediately intuitive like "characters"
has to do with how they process and understand text. At a high level, language models iteratively predict their next generated output based on
the initial input and their previous generations. Training on tokens lets language models handle linguistic
units (like words or subwords) that carry meaning, rather than individual characters, which makes it easier for the model
to learn and understand the structure of the language, including grammar and context.
Furthermore, using tokens improves efficiency, since the model processes fewer units of text compared to character-level processing.
* Conceptual Guide: [Tokens](/docs/concepts/tokens)

### Function/tool calling

13 changes: 13 additions & 0 deletions docs/docs/concepts/messages.mdx
@@ -1,5 +1,18 @@
# Messages

:::info Prerequisites

- [Chat Models](/docs/concepts/chat_models)
:::

Modern LLMs use a message-based system to communicate with the outside world. This system allows for more flexibility and extensibility than older string-based completion APIs. Messages are objects that contain information about the input and output of a model, and can be passed between components in a LangChain pipeline.
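
For example, a short conversation in LangChain's message format might look like this (a sketch using the core message classes):

```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

# A conversation is represented as a list of typed message objects.
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is LangChain?"),
    AIMessage(content="LangChain is a framework for building LLM applications."),
]
```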

## LangChain Message Format

## HumanMessage

## AIMessage
22 changes: 22 additions & 0 deletions docs/docs/concepts/tokens.mdx
@@ -0,0 +1,22 @@
# Tokens

Most model providers measure input and output in units called **tokens**.
Tokens are the basic units that language models read and generate when processing or producing text.
The exact definition of a token can vary depending on the specific way the model was trained -
for instance, in English, a token could be a single word like "apple", or a part of a word like "app".

When you send a model a prompt, the words and characters in the prompt are encoded into tokens using a **tokenizer**.
The model then streams back generated output tokens, which the tokenizer decodes into human-readable text.
The below example shows how OpenAI models tokenize `LangChain is cool!`:

![](/img/tokenization.png)

You can see that it gets split into 5 different tokens, and that the boundaries between tokens are not exactly the same as word boundaries.
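
You can reproduce this kind of split yourself with OpenAI's `tiktoken` library (a sketch; the exact boundaries depend on the encoding used):

```python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")
token_ids = encoding.encode("LangChain is cool!")
print(len(token_ids))                              # e.g. 5 tokens
print([encoding.decode([t]) for t in token_ids])   # the text piece behind each id
```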

The reason language models use tokens rather than something more immediately intuitive like "characters"
has to do with how they process and understand text. At a high level, language models iteratively predict their next generated output based on
the initial input and their previous generations. Training on tokens lets language models handle linguistic
units (like words or subwords) that carry meaning, rather than individual characters, which makes it easier for the model
to learn and understand the structure of the language, including grammar and context.
Furthermore, using tokens improves efficiency, since the model processes fewer units of text compared to character-level processing.
