diff --git a/docs/docs/concepts/agents.mdx b/docs/docs/concepts/agents.mdx
index 98c285341098f..960eb2a975d1e 100644
--- a/docs/docs/concepts/agents.mdx
+++ b/docs/docs/concepts/agents.mdx
@@ -1,19 +1,21 @@
# Agents

-We recommend that you use [LangGraph](/docs/concepts/architecture#langgraph) for building agents.
+By themselves, language models can't take actions - they just output text. Agents are systems that take a high-level task and use an LLM as a reasoning engine to decide what actions to take and execute those actions.
+
+[LangGraph](/docs/concepts/architecture#langgraph) is an extension of LangChain specifically aimed at creating highly controllable and customizable agents. We recommend that you use LangGraph for building agents.

Please see the following resources for more information:

-* LangGraph docs for conceptual architecture about [Agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/)
-* [Pre-built agent in LangGraph](https://langchain-ai.github.io/langgraph/reference/prebuilt/#langgraph.prebuilt.chat_agent_executor.create_react_agent)
+* LangGraph docs on [common agent architectures](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/)
+* [Pre-built agents in LangGraph](https://langchain-ai.github.io/langgraph/reference/prebuilt/#langgraph.prebuilt.chat_agent_executor.create_react_agent)

-## Legacy Agent Concept: AgentExecutor
+## Legacy agent concept: AgentExecutor

LangChain previously introduced the `AgentExecutor` as a runtime for agents.
While it served as an excellent starting point, its limitations became apparent when dealing with more sophisticated and customized agents.
As a result, we're gradually phasing out `AgentExecutor` in favor of more flexible solutions in LangGraph.

### Transitioning from AgentExecutor to LangGraph

If you're currently using `AgentExecutor`, don't worry! We've prepared resources to help you:

diff --git a/docs/docs/concepts/async.mdx b/docs/docs/concepts/async.mdx
index 754ef552c2a61..2a1d5acf57845 100644
--- a/docs/docs/concepts/async.mdx
+++ b/docs/docs/concepts/async.mdx
@@ -1,12 +1,10 @@
-# Async Programming with LangChain
+# Async programming with LangChain

:::info Prerequisites
-* [Runnable Interface](/docs/concepts/runnables)
-* [asyncio documentation](https://docs.python.org/3/library/asyncio.html)
+* [Runnable interface](/docs/concepts/runnables)
+* [asyncio](https://docs.python.org/3/library/asyncio.html)
:::

-## Overview
-
LLM based applications often involve a lot of I/O-bound operations, such as making API calls to language models, databases, or other services. Asynchronous programming (or async programming) is a paradigm that allows a program to perform multiple tasks concurrently without blocking the execution of other tasks, improving efficiency and responsiveness, particularly in I/O-bound operations.

:::note
You are expected to be familiar with asynchronous programming in Python before reading this guide.

This guide specifically focuses on what you need to know to work with LangChain in an asynchronous context, assuming that you are already familiar with asynchronous programming.
:::

-## LangChain Asynchronous APIs
+## LangChain asynchronous APIs

Many LangChain APIs are designed to be asynchronous, allowing you to build efficient and responsive applications.

@@ -41,7 +39,7 @@ the full [Runnable Interface](/docs/concepts/runnables).
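+For example, a minimal sketch of calling a chat model asynchronously (assuming the `langchain-openai` package is installed and an OpenAI API key is configured; any chat model integration exposes the same "a"-prefixed methods):
+
+```python
+import asyncio
+
+from langchain_openai import ChatOpenAI
+
+model = ChatOpenAI(model="gpt-4o")
+
+async def main() -> None:
+    # ainvoke is the asynchronous counterpart of invoke
+    result = await model.ainvoke("Hello, how are you?")
+    print(result.content)
+
+asyncio.run(main())
+```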
For more information, please review the [API reference](https://python.langchain.com/api_reference/) for the specific component you are using.

-## Delegation to Sync Methods
+## Delegation to sync methods

Most popular LangChain integrations implement asynchronous support of their APIs. For example, the `ainvoke` method of many ChatModel implementations uses the `httpx.AsyncClient` to make asynchronous HTTP requests to the model provider's API.

@@ -75,9 +73,9 @@ in certain scenarios.

If you are experiencing issues with streaming, callbacks or tracing in async code and are using Python 3.9 or 3.10, this is a likely cause.

Please read [Propagation RunnableConfig](/docs/concepts/runnables#propagation-runnableconfig) for more details to learn how to propagate the `RunnableConfig` down the call chain manually (or upgrade to Python 3.11 where this is no longer an issue).

-## How to use in IPython and Jupyter Notebooks
+## How to use in IPython and Jupyter notebooks

As of IPython 7.0, IPython supports asynchronous REPLs. This means that you can use the `await` keyword in the IPython REPL and Jupyter Notebooks without any additional setup. For more information, see the [IPython blog post](https://blog.jupyter.org/ipython-7-0-async-repl-a35ce050f7f7).

diff --git a/docs/docs/concepts/callbacks.mdx b/docs/docs/concepts/callbacks.mdx
index 997930554d94e..6e3975271d8eb 100644
--- a/docs/docs/concepts/callbacks.mdx
+++ b/docs/docs/concepts/callbacks.mdx
@@ -4,13 +4,11 @@
- [Runnable interface](/docs/concepts/#runnable-interface)
:::

-## Overview
-
LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.

You can subscribe to these events by using the `callbacks` argument available throughout the API. This argument is a list of handler objects, which are expected to implement one or more of the methods described below in more detail.

-## Callback Events
+## Callback events

| Event            | Event Trigger                                | Associated Method     |
|------------------|----------------------------------------------|-----------------------|

diff --git a/docs/docs/concepts/chat_history.mdx b/docs/docs/concepts/chat_history.mdx
index 54d5768d8c69b..967f93af968e0 100644
--- a/docs/docs/concepts/chat_history.mdx
+++ b/docs/docs/concepts/chat_history.mdx
@@ -1,17 +1,15 @@
-# Chat History
+# Chat history

:::info Prerequisites

- [Messages](/docs/concepts/messages)
-- [Chat Models](/docs/concepts/chat_models)
-- [Tool Calling](/docs/concepts/tool_calling)
+- [Chat models](/docs/concepts/chat_models)
+- [Tool calling](/docs/concepts/tool_calling)
:::

-## Overview
-
Chat history is a record of the conversation between the user and the chat model. It is used to maintain context and state throughout the conversation. The chat history is a sequence of [messages](/docs/concepts/messages), each of which is associated with a specific [role](/docs/concepts/messages#role), such as "user", "assistant", "system", or "tool".

-## Conversation Patterns
+## Conversation patterns

![Conversation patterns](/img/conversation_patterns.png)

@@ -24,7 +22,7 @@ So a full conversation often involves a combination of two patterns of alternati
1. The **user** and the **assistant** representing a back-and-forth conversation.
2. The **assistant** and **tool messages** representing an ["agentic" workflow](/docs/concepts/agents) where the assistant is invoking tools to perform specific tasks.

-## Managing Chat History
+## Managing chat history

Since chat models have a maximum limit on input size, it's important to manage chat history and trim it as needed to avoid exceeding the [context window](/docs/concepts/chat_models#context-window).

@@ -42,7 +40,7 @@ Understanding correct conversation structure is essential for being able to prop
[memory](https://langchain-ai.github.io/langgraph/concepts/memory/) in chat models.
:::

-## Related Resources
+## Related resources

-- [How to Trim Messages](https://python.langchain.com/docs/how_to/trim_messages/)
-- [Memory Guide](https://langchain-ai.github.io/langgraph/concepts/memory/) for information on implementing short-term and long-term memory in chat models using [LangGraph](https://langchain-ai.github.io/langgraph/).
+- [How to trim messages](https://python.langchain.com/docs/how_to/trim_messages/)
+- [Memory guide](https://langchain-ai.github.io/langgraph/concepts/memory/) for information on implementing short-term and long-term memory in chat models using [LangGraph](https://langchain-ai.github.io/langgraph/).

diff --git a/docs/docs/concepts/chat_models.mdx b/docs/docs/concepts/chat_models.mdx
index c528b8b6acf64..e924168e2cd71 100644
--- a/docs/docs/concepts/chat_models.mdx
+++ b/docs/docs/concepts/chat_models.mdx
@@ -1,21 +1,22 @@
-# Chat Models
+# Chat models

## Overview

Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific tuning for every scenario.

-Modern LLMs are typically accessed through a chat model interface that takes [messages](/docs/concepts/messages) as input and returns [messages](/docs/concepts/messages) as output.
+Modern LLMs are typically accessed through a chat model interface that takes a list of [messages](/docs/concepts/messages) as input and returns a [message](/docs/concepts/messages) as output.

The newest generation of chat models offers additional capabilities:

-* [Tool Calling](/docs/concepts#tool-calling): Many popular chat models offer a native [tool calling](/docs/concepts#tool-calling) API. This API allows developers to build rich applications that enable AI to interact with external services, APIs, and databases. Tool calling can also be used to extract structured information from unstructured data and perform various other tasks.
+* [Tool calling](/docs/concepts#tool-calling): Many popular chat models offer a native [tool calling](/docs/concepts#tool-calling) API. This API allows developers to build rich applications that enable AI to interact with external services, APIs, and databases. Tool calling can also be used to extract structured information from unstructured data and perform various other tasks.
+* [Structured output](/docs/concepts/structured_outputs): A technique to make a chat model respond in a structured format, such as JSON that matches a given schema.
* [Multimodality](/docs/concepts/multimodality): The ability to work with data other than text; for example, images, audio, and video.
## Features

LangChain provides a consistent interface for working with chat models from different providers while offering additional features for monitoring, debugging, and optimizing the performance of applications that use LLMs.

-* Integrations with many chat model providers (e.g., Anthropic, OpenAI, Ollama, Cohere, Hugging Face, Groq, Microsoft Azure, Google Vertex, Amazon Bedrock). Please see [chat model integrations](/docs/integrations/chat/) for an up-to-date list of supported models.
+* Integrations with many chat model providers (e.g., Anthropic, OpenAI, Ollama, Microsoft Azure, Google Vertex, Amazon Bedrock, Hugging Face, Cohere, Groq). Please see [chat model integrations](/docs/integrations/chat/) for an up-to-date list of supported models.
* Use either LangChain's [messages](/docs/concepts/messages) format or OpenAI format.
* Standard [tool calling API](/docs/concepts#tool-calling): standard interface for binding tools to models, accessing tool call requests made by models, and sending tool results back to the model.
* Standard API for [structuring outputs](/docs/concepts/structured_outputs) via the `with_structured_output` method.
@@ -23,14 +24,14 @@ LangChain provides a consistent interface for working with chat models from diff
* Integration with [LangSmith](https://docs.smith.langchain.com) for monitoring and debugging production-grade applications based on LLMs.
* Additional features like standardized [token usage](/docs/concepts/messages#token_usage), [rate limiting](#rate-limiting), [caching](#cache) and more.

-## Available Integrations
+## Integrations

LangChain has many chat model integrations that allow you to use a wide variety of models from different providers.

These integrations are one of two types:

-1. **Official Models**: These are models that are officially supported by LangChain and/or model provider. You can find these models in the `langchain-` packages.
-2. **Community Models**: There are models that are mostly contributed and supported by the community. You can find these models in the `langchain-community` package.
+1. **Official models**: These are models that are officially supported by LangChain and/or the model provider. You can find these models in the `langchain-` packages.
+2. **Community models**: These are models that are mostly contributed and supported by the community. You can find these models in the `langchain-community` package.

LangChain chat models are named with a convention that prefixes "Chat" to their class names (e.g., `ChatOllama`, `ChatAnthropic`, `ChatOpenAI`, etc.).

@@ -56,7 +57,7 @@ However, LangChain also has implementations of older LLMs that do not follow the
These models implement the [BaseLLM](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.llms.BaseLLM.html#langchain_core.language_models.llms.BaseLLM) interface and may be named with the "LLM" suffix (e.g., `OllamaLLM`, `AnthropicLLM`, `OpenAILLM`, etc.). Generally, users should not use these models.
:::

-### Key Methods
+### Key methods

The key methods of a chat model are:

@@ -68,7 +69,7 @@ The key methods of a chat model are:

Other important methods can be found in the [BaseChatModel API Reference](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html).
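+As a brief illustration of the two most commonly used methods (a minimal sketch assuming the `langchain-openai` package and a configured API key; every provider's chat model exposes the same interface):
+
+```python
+from langchain_openai import ChatOpenAI
+
+model = ChatOpenAI(model="gpt-4o")
+
+# invoke: takes a list of messages (or a single string) and returns a message
+response = model.invoke([{"role": "user", "content": "Hello!"}])
+print(response.content)
+
+# stream: yields the response incrementally as a series of message chunks
+for chunk in model.stream("Write a haiku about the sea"):
+    print(chunk.content, end="", flush=True)
+```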
-### Inputs and Outputs
+### Inputs and outputs

Modern LLMs are typically accessed through a chat model interface that takes [messages](/docs/concepts/messages) as input and returns [messages](/docs/concepts/messages) as output. Messages are typically associated with a role (e.g., "system", "human", "assistant") and one or more content blocks that contain text or potentially multimodal data (e.g., images, audio, video).

@@ -77,7 +78,7 @@ LangChain supports two message formats to interact with chat models:
1. **LangChain Message Format**: LangChain's own message format, which is used by default and is used internally by LangChain.
2. **OpenAI's Message Format**: OpenAI's message format.

-### Standard Parameters
+### Standard parameters

Many chat models have standardized parameters that can be used to configure the model:

@@ -100,12 +101,12 @@ Some important things to note:

ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel, head to the [API reference](https://python.langchain.com/api_reference/) for that model.

-## Tool Calling
+## Tool calling

Chat models can call [tools](/docs/concepts/tools) to perform tasks such as fetching data from a database, making API requests, or running custom code. Please see the [tool calling](/docs/concepts#tool-calling) guide for more information.

-## Structured Outputs
+## Structured outputs

Chat models can be requested to respond in a particular format (e.g., JSON or matching a particular schema). This feature is extremely useful for information extraction tasks. Please read more about
@@ -117,7 +118,7 @@ Large Language Models (LLMs) are not limited to processing text. They can also b

Currently, only some LLMs support multimodal inputs, and almost none support multimodal outputs. Please consult the specific model documentation for details.

-## Context Window
+## Context window

A chat model's context window refers to the maximum size of the input sequence the model can process at one time. While the context windows of modern LLMs are quite large, they still present a limitation that developers must keep in mind when working with chat models.

@@ -125,7 +126,7 @@ If the input exceeds the context window, the model may not be able to process th
The size of the input is measured in [tokens](/docs/concepts/tokens), which are the unit of processing that the model uses.

-## Advanced Topics
+## Advanced topics

### Rate-limiting

@@ -153,7 +154,7 @@ However, there might be situations where caching chat model responses is benefic

Please see the [how to cache chat model responses](/docs/how_to/#chat-model-caching) guide for more details.

-## Related Resources
+## Related resources

* How-to guides on using chat models: [how-to guides](/docs/how_to/#chat-models).
* List of supported chat models: [chat model integrations](/docs/integrations/chat/).

diff --git a/docs/docs/concepts/document_loaders.mdx b/docs/docs/concepts/document_loaders.mdx
index b9ed0f165cad5..a6a11ddfe7104 100644
--- a/docs/docs/concepts/document_loaders.mdx
+++ b/docs/docs/concepts/document_loaders.mdx
@@ -3,14 +3,12 @@

:::info[Prerequisites]

-* [Document API Reference](https://python.langchain.com/docs/how_to/#document-loaders)
+* [Document loaders API reference](https://python.langchain.com/docs/how_to/#document-loaders)
:::

-## Overview
-
Document loaders are designed to load document objects. LangChain has hundreds of integrations with various data sources to load data from: Slack, Notion, Google Drive, etc.
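+For example, a minimal sketch of loading a CSV file (assuming the `langchain-community` package and a local `example.csv`; this mirrors the interface example later on this page):
+
+```python
+from langchain_community.document_loaders.csv_loader import CSVLoader
+
+loader = CSVLoader(file_path="example.csv")
+
+# lazy_load yields Document objects one at a time, keeping memory usage low
+for document in loader.lazy_load():
+    print(document)
+```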
-## Available Integrations
+## Integrations

You can find available integrations on the [Document Loaders Integrations page](https://python.langchain.com/docs/integrations/document_loaders/).

@@ -38,10 +36,10 @@ for document in loader.lazy_load():
    print(document)
```

-## Related Resources
+## Related resources

Please see the following resources for more information:

* [How-to guides for document loaders](https://python.langchain.com/docs/how_to/#document-loaders)
-* [Document API Reference](https://python.langchain.com/docs/how_to/#document-loaders)
-* [Document Loaders Integrations](https://python.langchain.com/docs/integrations/document_loaders/)
+* [Document API reference](https://python.langchain.com/docs/how_to/#document-loaders)
+* [Document loaders integrations](https://python.langchain.com/docs/integrations/document_loaders/)

diff --git a/docs/docs/concepts/embedding_models.mdx b/docs/docs/concepts/embedding_models.mdx
index 07488ecf4ff69..978188421c6fd 100644
--- a/docs/docs/concepts/embedding_models.mdx
+++ b/docs/docs/concepts/embedding_models.mdx
@@ -13,9 +13,7 @@ This conceptual overview focuses on text-based embedding models.

Embedding models can also be [multimodal](/docs/concepts/multimodality), though such models are not currently supported by LangChain.
:::

-## Overview
-
-Imagine being able to capture the essence of any text - a tweet, document, or book - in a single, compact representation.
+Imagine being able to capture the essence of any text - a tweet, document, or book - in a single, compact representation.
This is the power of embedding models, which lie at the heart of many retrieval systems.
Embedding models transform human language into a format that machines can understand and compare with speed and accuracy.
These models take text as input and produce a fixed-length array of numbers, a numerical fingerprint of the text's semantic meaning.

@@ -49,7 +47,7 @@ To navigate this variety, researchers and practitioners often turn to benchmarks

:::

-### LangChain Interface
+### Interface

LangChain provides a universal interface for working with embedding models, providing standard methods for common operations.
This common interface simplifies interaction with various embedding providers through two central methods:

@@ -89,7 +87,7 @@ query_embedding = embeddings_model.embed_query("What is the meaning of life?")

:::

-### Available integrations
+### Integrations

LangChain offers many embedding model integrations which you can find on the [embedding models integrations page](/docs/integrations/text_embedding/).

diff --git a/docs/docs/concepts/example_selectors.mdx b/docs/docs/concepts/example_selectors.mdx
index c34cf6d6735cd..32dad8c5fa443 100644
--- a/docs/docs/concepts/example_selectors.mdx
+++ b/docs/docs/concepts/example_selectors.mdx
@@ -15,6 +15,6 @@ Sometimes these examples are hardcoded into the prompt, but for more advanced si

**Example Selectors** are classes responsible for selecting and then formatting examples into prompts.

-## Related Resources
+## Related resources

* [Example selector how-to guides](/docs/how_to/#example-selectors)
\ No newline at end of file

diff --git a/docs/docs/concepts/index.mdx b/docs/docs/concepts/index.mdx
index 7eb47479d9991..689db4b06a394 100644
--- a/docs/docs/concepts/index.mdx
+++ b/docs/docs/concepts/index.mdx
@@ -1,48 +1,44 @@
-# Conceptual Guide
+# Conceptual guide

-## Overview
+This guide provides explanations of the key concepts behind the LangChain framework and AI applications more broadly.
-In this guide, you'll find explanations of the key concepts, providing a deeper understanding of core principles. +We recommend that you go through at least one of the [Tutorials](/docs/tutorials) before diving into the conceptual guide. This will provide practical context that will make it easier to understand the concepts discussed here. -We recommend that you go through at least one of the [Tutorials](/docs/tutorials) before diving into the conceptual guide. This will help you understand the context and practical applications of the concepts discussed here. +The conceptual guide does not cover step-by-step instructions or specific implementation examples — those are found in the [How-to guides](/docs/how_to/) and [Tutorials](/docs/tutorials). For detailed reference material, please see the [API reference](https://python.langchain.com/api_reference/). -The conceptual guide will not cover step-by-step instructions or specific implementation details — those are found in the [How-To Guides](/docs/how_to/) and [Tutorials](/docs/tutorials) sections. For detailed reference material, please visit the [API Reference](https://python.langchain.com/api_reference/). +## High level - -## High Level - -- **[Why LangChain?](/docs/concepts/why_langchain)**: Why LangChain is the best choice for building AI applications. -- **[Architecture](/docs/concepts/architecture)**: Overview of how packages are organized in the LangChain ecosystem. +- **[Why LangChain?](/docs/concepts/why_langchain)**: Overview of the value that LangChain provides. +- **[Architecture](/docs/concepts/architecture)**: How packages are organized in the LangChain ecosystem. ## Concepts -- **[Chat models](/docs/concepts/chat_models)**: LLMs exposed via a chat interface which process sequences of messages as input and output a message. -- **[Messages](/docs/concepts/messages)**: Messages are the unit of communication in modern LLMs, used to represent input and output of a chat model, as well as any additional context or metadata that may be associated with the conversation. -- **[Chat history](/docs/concepts/chat_history)**: Chat history is a record of the conversation between the user and the chat model, used to maintain context and state throughout the conversation. -- **[Tools](/docs/concepts/tools)**: The tool abstraction in LangChain associates a Python function** with a schema defining the function's name, description, and input. -- **[Tool calling](/docs/concepts/tool_calling)**: Tool calling is a special type of chat model API that allows you to pass tool schemas to a model and get back invocations of those tools. -- **[Structured output](/docs/concepts/structured_outputs)**: A technique to make the chat model respond in a structured format, such as JSON that's matching a specified schema. -- **[Memory](https://langchain-ai.github.io/langgraph/concepts/memory/)**: Persisting information from conversations, so it can be used in future conversations. +- **[Chat models](/docs/concepts/chat_models)**: LLMs exposed via a chat API that process sequences of messages as input and output a message. +- **[Messages](/docs/concepts/messages)**: The unit of communication in chat models, used to represent model input and output. +- **[Chat history](/docs/concepts/chat_history)**: A conversation represented as a sequence of messages, alternating between user messages and model responses. +- **[Tools](/docs/concepts/tools)**: A function with an associated schema defining the function's name, description, and the arguments it accepts. 
+- **[Tool calling](/docs/concepts/tool_calling)**: A type of chat model API that accepts tool schemas, along with messages, as input and returns invocations of those tools as part of the output message. +- **[Structured output](/docs/concepts/structured_outputs)**: A technique to make a chat model respond in a structured format, such as JSON that matches a given schema. +- **[Memory](https://langchain-ai.github.io/langgraph/concepts/memory/)**: Information about a conversation that is persisted so that it can be used in future conversations. - **[Multimodality](/docs/concepts/multimodality)**: The ability to work with data that comes in different forms, such as text, audio, images, and video. -- **[Tokens](/docs/concepts/tokens)**: Modern large language models (LLMs) are typically based on a transformer architecture that processes a sequence of units known as tokens. -- **[Runnable interface](/docs/concepts/runnables)**: A standard Runnable interface implemented across many in LangChain components. -- **[LangChain Expression Language (LCEL)](/docs/concepts/lcel)**: A declarative approach to building pipelines with LangChain components. LCEL servers as a simple orchestration language for LangChain. -- **[Document loaders](/docs/concepts/document_loaders)**: Components that help loading documents from various sources. +- **[Runnable interface](/docs/concepts/runnables)**: The base abstraction that many LangChain components and the LangChain Expression Language are built on. +- **[LangChain Expression Language (LCEL)](/docs/concepts/lcel)**: A syntax for orchestrating LangChain components. Most useful for simpler applications. +- **[Document loaders](/docs/concepts/document_loaders)**: Load a source as a list of documents. - **[Retrieval](/docs/concepts/retrieval)**: Information retrieval systems can retrieve structured or unstructured data from a datasource in response to a query. -- **[Text splitters](/docs/concepts/text_splitters)**: Use to split long content into smaller more manageable chunks. -- **[Embedding models](/docs/concepts/embedding_models)**: Embedding models are models that can represent data in a vector space. -- **[Vector stores](/docs/concepts/vectorstores)**: A datastore that can store embeddings and associated data and supports efficient vector search. -- **[Retriever](/docs/concepts/retrievers)**: A retriever is a component that retrieves relevant documents from a knowledge base in response to a query. +- **[Text splitters](/docs/concepts/text_splitters)**: Split long text into smaller chunks that can be individually indexed to enable granular retrieval. +- **[Embedding models](/docs/concepts/embedding_models)**: Models that represent data such as text or images in a vector space. +- **[Vector stores](/docs/concepts/vectorstores)**: Storage of and efficient search over vectors and associated metadata. +- **[Retriever](/docs/concepts/retrievers)**: A component that returns relevant documents from a knowledge base in response to a query. - **[Retrieval Augmented Generation (RAG)](/docs/concepts/rag)**: A technique that enhances language models by combining them with external knowledge bases. -- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tools](/docs/concepts/tools). 
-- **[Prompt templates](/docs/concepts/prompt_templates)**: Used to define reusable structures for generating prompts dynamically, allowing for variables or placeholders to be filled in when needed. This is particularly useful with [LCEL](/docs/concepts/lcel) or when prompts need to be stored and retrieved from a database for repeated use.
-- **[Async programming with LangChain](/docs/concepts/async)**: Guidelines about programming with LangChain in an asynchronous context.
-- **[Callbacks](/docs/concepts/callbacks)**: Callbacks are used to stream outputs from LLMs in LangChain, observe the progress of an LLM application, and more.
-- **[Output parsers](/docs/concepts/output_parsers)**: Components that take the output of a model and transform it into a more suitable format for downstream tasks. Output parsers were primarily useful prior to the general availability of [chat models](/docs/concepts/chat_models) that natively support [tool calling](/docs/concepts/tool_calling) and [structured outputs](/docs/concepts/structured_outputs).
-- **[Few shot prompting](/docs/concepts/few_shot_prompting)**: Few-shot prompting is a technique used improve the performance of language models by providing them with a few examples of the task they are expected to perform.
-- **[Example selectors](/docs/concepts/example_selectors)**: Example selectors are used to select examples from a dataset based on a given input. They can be used to select examples randomly, by semantic similarity, or based on some other constraints. Example selectors are used in few-shot prompting to select examples for a prompt.
-- **[Tracing](/docs/concepts/tracing)**: Tracing is the process of recording the steps that an application takes to go from input to output. Tracing is essential for debugging and diagnosing issues in complex applications.
-- **[Evaluation](/docs/concepts/evaluation)**: Evaluation is the process of assessing the performance and effectiveness of your LLM-powered applications. It involves testing the model's responses against a set of predefined criteria or benchmarks to ensure it meets the desired quality standards and fulfills the intended purpose. This process is vital for building reliable applications. For more information on evaluation in LangChain, see the [LangSmith documentation](https://docs.smith.langchain.com/concepts/evaluation).
+- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tools](/docs/concepts/tools).
+- **[Prompt templates](/docs/concepts/prompt_templates)**: Component for factoring out the static parts of a model "prompt" (usually a sequence of messages). Useful for serializing, versioning, and reusing these static parts.
+- **[Output parsers](/docs/concepts/output_parsers)**: Responsible for taking the output of a model and transforming it into a more suitable format for downstream tasks. Output parsers were primarily useful prior to the general availability of [tool calling](/docs/concepts/tool_calling) and [structured outputs](/docs/concepts/structured_outputs).
+- **[Few-shot prompting](/docs/concepts/few_shot_prompting)**: A technique for improving model performance by providing a few examples of the task to perform in the prompt.
+- **[Example selectors](/docs/concepts/example_selectors)**: Used to select the most relevant examples from a dataset based on a given input. Example selectors are used in few-shot prompting to select examples for a prompt.
+- **[Async programming](/docs/concepts/async)**: The basics that one should know to use LangChain in an asynchronous context.
+- **[Callbacks](/docs/concepts/callbacks)**: Callbacks enable the execution of custom auxiliary code in built-in components. Callbacks are used to stream outputs from LLMs in LangChain, trace the intermediate steps of an application, and more.
+- **[Tracing](/docs/concepts/tracing)**: The process of recording the steps that an application takes to go from input to output. Tracing is essential for debugging and diagnosing issues in complex applications.
+- **[Evaluation](/docs/concepts/evaluation)**: The process of assessing the performance and effectiveness of AI applications. This involves testing the model's responses against a set of predefined criteria or benchmarks to ensure it meets the desired quality standards and fulfills the intended purpose. This process is vital for building reliable applications.

## Glossary

@@ -53,8 +49,8 @@ The conceptual guide will not cover step-by-step instructions or specific implem

- **[batch](/docs/concepts/runnables)**: Use to execute a Runnable with batch inputs.
- **[bind_tools](/docs/concepts/chat_models#bind-tools)**: Allows models to interact with tools.
- **[Caching](/docs/concepts/chat_models#caching)**: Storing results to avoid redundant calls to a chat model.
-- **[Chat Models](/docs/concepts/multimodality#chat-models)**: Chat models that handle multiple data modalities.
-- **[Configurable Runnables](/docs/concepts/runnables#configurable-Runnables)**: Creating configurable Runnables.
+- **[Chat models](/docs/concepts/multimodality#chat-models)**: Chat models that handle multiple data modalities.
+- **[Configurable runnables](/docs/concepts/runnables#configurable-runnables)**: Creating configurable Runnables.
- **[Context window](/docs/concepts/chat_models#context-window)**: The maximum size of input a chat model can process.
- **[Conversation patterns](/docs/concepts/chat_history#conversation-patterns)**: Common patterns in chat interactions.
- **[Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html)**: LangChain's representation of a document.
@@ -64,6 +60,7 @@
- **[InjectedStore](/docs/concepts/tools#injectedstore)**: A store that can be injected into a tool for data persistence.
- **[InjectedToolArg](/docs/concepts/tools#injectedtoolarg)**: Mechanism to inject arguments into tool functions.
- **[input and output types](/docs/concepts/runnables#input-and-output-types)**: Types used for input and output in Runnables.
+- **[Integration packages](/docs/concepts/architecture#partner-packages)**: Third-party packages that integrate with LangChain.
- **[invoke](/docs/concepts/runnables)**: A standard method to invoke a Runnable.
- **[JSON mode](/docs/concepts/structured_outputs#json-mode)**: Returning responses in JSON format.
- **[langchain-community](/docs/concepts/architecture#langchain-community)**: Community-driven components for LangChain.
@@ -72,23 +69,21 @@
- **[langgraph](/docs/concepts/architecture#langgraph)**: Powerful orchestration layer for LangChain. Use to build complex pipelines and workflows.
- **[langserve](/docs/concepts/architecture#langserve)**: Use to deploy LangChain Runnables as REST endpoints. Uses FastAPI.
Works primarily for LangChain Runnables, does not currently integrate with LangGraph.
- **[Managing chat history](/docs/concepts/chat_history#managing-chat-history)**: Techniques to maintain and manage the chat history.
-- **[Multimodality](/docs/concepts/chat_models#multimodality)**: Capability to process different types of data like text, audio, and images.
- **[OpenAI format](/docs/concepts/messages#openai-format)**: OpenAI's message format for chat models.
-- **[Partner packages](/docs/concepts/architecture#partner-packages)**: Third-party packages that integrate with LangChain.
-- **[Propagation of RunnableConfig](/docs/concepts/runnables#propagation-runnableconfig)**: Propagating configuration through Runnables. Read if working with python 3.9, 3.10 and async.
+- **[Propagation of RunnableConfig](/docs/concepts/runnables#propagation-runnableconfig)**: Propagating configuration through Runnables. Read if working with Python 3.9, 3.10 and async.
- **[rate-limiting](/docs/concepts/chat_models#rate-limiting)**: Client side rate limiting for chat models.
- **[RemoveMessage](/docs/concepts/messages#remove-message)**: An abstraction used to remove a message from chat history, used primarily in LangGraph.
- **[role](/docs/concepts/messages#role)**: Represents the role (e.g., user, assistant) of a chat message.
- **[RunnableConfig](/docs/concepts/runnables#runnableconfig)**: Use to pass run time information to Runnables (e.g., `run_name`, `run_id`, `tags`, `metadata`, `max_concurrency`, `recursion_limit`, `configurable`).
- **[Standard parameters for chat models](/docs/concepts/chat_models#standard-parameters)**: Parameters such as API key, `temperature`, and `max_tokens`.
- **[stream](/docs/concepts/streaming)**: Use to stream output from a Runnable or a graph.
- **[Tokenization](/docs/concepts/tokens)**: The process of converting data into tokens and vice versa.
-- **[Tokens](/docs/concepts/tokens)**: The basic unit that a language model reads, processes, and generates.
+- **[Tokens](/docs/concepts/tokens)**: The basic unit that a language model reads, processes, and generates under the hood.
- **[Tool artifacts](/docs/concepts/tools#tool-artifacts)**: Add artifacts to the output of a tool that will not be sent to the model, but will be available for downstream processing.
- **[Tool binding](/docs/concepts/tool_calling#tool-binding)**: Binding tools to models.
- **[@tool](/docs/concepts/tools#@tool)**: Decorator for creating tools in LangChain.
- **[Toolkits](/docs/concepts/tools#toolkits)**: A collection of tools that can be used together.
- **[ToolMessage](/docs/concepts/messages#toolmessage)**: Represents a message that contains the results of a tool execution.
-- **[Vectorstores](/docs/concepts/vectorstores)**: Datastores specialized for storing and efficiently searching vector embeddings.
+- **[Vector stores](/docs/concepts/vectorstores)**: Datastores specialized for storing and efficiently searching vector embeddings.
- **[with_structured_output](/docs/concepts/chat_models#with-structured-output)**: A helper method for chat models that natively support [tool calling](/docs/concepts/tool_calling) to get structured output matching a given schema specified via Pydantic, JSON schema or a function.
- **[with_types](/docs/concepts/runnables#with_types)**: Method to overwrite the input and output types of a runnable. Useful when working with complex LCEL chains and deploying with LangServe.

diff --git a/docs/docs/concepts/key_value_stores.mdx b/docs/docs/concepts/key_value_stores.mdx
index aacd897e8f526..d8503dbc09360 100644
--- a/docs/docs/concepts/key_value_stores.mdx
+++ b/docs/docs/concepts/key_value_stores.mdx
@@ -33,6 +33,6 @@ All [`BaseStores`](https://python.langchain.com/api_reference/core/stores/langch

- `mdelete(key: Sequence[str]) -> None`: delete multiple keys
- `yield_keys(prefix: Optional[str] = None) -> Iterator[str]`: yield all keys in the store, optionally filtering by a prefix

-## Available Integrations
+## Integrations

Please reference the [stores integration page](/docs/integrations/stores/) for a list of available key-value store integrations.

diff --git a/docs/docs/concepts/llms.mdx b/docs/docs/concepts/llms.mdx
index cc32dcc9131ca..5e2f7d98c7256 100644
--- a/docs/docs/concepts/llms.mdx
+++ b/docs/docs/concepts/llms.mdx
@@ -1,3 +1,3 @@
-# Large Language Models (LLMs)
+# Large language models (LLMs)

Please see the [Chat Model Concept Guide](/docs/concepts/chat_models) page for more information.
\ No newline at end of file

diff --git a/docs/docs/concepts/multimodality.mdx b/docs/docs/concepts/multimodality.mdx
index 1e3ff297daf60..3692e4e1ef1ef 100644
--- a/docs/docs/concepts/multimodality.mdx
+++ b/docs/docs/concepts/multimodality.mdx
@@ -8,7 +8,7 @@

- **Embedding Models**: Embedding Models can represent multimodal content, embedding various forms of data—such as text, images, and audio—into vector spaces.
- **Vector Stores**: Vector stores could search over embeddings that represent multimodal data, enabling retrieval across different types of information.

-## Multimodality in Chat models
+## Multimodality in chat models

:::info Pre-requisites
* [Chat models](/docs/concepts/chat_models)

diff --git a/docs/docs/concepts/output_parsers.mdx b/docs/docs/concepts/output_parsers.mdx
index 11c6a196611b5..a03daea8737a5 100644
--- a/docs/docs/concepts/output_parsers.mdx
+++ b/docs/docs/concepts/output_parsers.mdx
@@ -26,7 +26,7 @@ LangChain has lots of different types of output parsers. This is a list of outpu

| Name | Supports Streaming | Has Format Instructions | Calls LLM | Input Type | Output Type | Description |
|------|--------------------|-------------------------|-----------|--------------------|----------------------|-------------|
-| [JSON](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser) | ✅ | ✅ | | `str` \| `Message` | JSON object | Returns a JSON object as specified. You can specify a Pydantic model and it will return JSON for that model. Probably the most reliable output parser for getting structured data that does NOT use function calling. |
+| [JSON](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser) | ✅ | ✅ | | `str` \| `Message` | JSON object | Returns a JSON object as specified. You can specify a Pydantic model and it will return JSON for that model. Probably the most reliable output parser for getting structured data that does NOT use function calling. |
| [XML](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html#langchain_core.output_parsers.xml.XMLOutputParser) | ✅ | ✅ | | `str` \| `Message` | `dict` | Returns a dictionary of tags. Use when XML output is needed. Use with models that are good at writing XML (like Anthropic's). |
| [CSV](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.list.CommaSeparatedListOutputParser.html#langchain_core.output_parsers.list.CommaSeparatedListOutputParser) | ✅ | ✅ | | `str` \| `Message` | `List[str]` | Returns a list of comma separated values. |
| [OutputFixing](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.fix.OutputFixingParser.html#langchain.output_parsers.fix.OutputFixingParser) | | | ✅ | `str` \| `Message` | | Wraps another output parser. If that output parser errors, then this will pass the error message and the bad output to an LLM and ask it to fix the output. |

diff --git a/docs/docs/concepts/rag.mdx b/docs/docs/concepts/rag.mdx
index dc459df95a477..eb4752b6ffe2d 100644
--- a/docs/docs/concepts/rag.mdx
+++ b/docs/docs/concepts/rag.mdx
@@ -1,4 +1,4 @@
-# Retrieval Augmented Generation (RAG)
+# Retrieval augmented generation (RAG)

:::info[Prerequisites]

@@ -15,7 +15,7 @@ The system then incorporates this retrieved information into the model's prompt.
The model uses the provided context to generate a response to the query.

By bridging the gap between vast language models and dynamic, targeted information retrieval, RAG is a powerful technique for building more capable and reliable AI systems.

-## Key Concepts
+## Key concepts

![Conceptual Overview](/img/rag_concepts.png)

diff --git a/docs/docs/concepts/retrieval.mdx b/docs/docs/concepts/retrieval.mdx
index eb0fe33305c0b..37bb1eb506d41 100644
--- a/docs/docs/concepts/retrieval.mdx
+++ b/docs/docs/concepts/retrieval.mdx
@@ -39,7 +39,7 @@ This translation enables more intuitive and flexible interactions with complex d

(2) **Information retrieval**: Search queries are used to fetch information from various retrieval systems.

-## Query Analysis
+## Query analysis

While users typically prefer to interact with retrieval systems using natural language, retrieval systems can require specific query syntax or benefit from particular keywords.
Query analysis serves as a bridge between raw user input and optimized search queries. Some common applications of query analysis include:

@@ -49,7 +49,7 @@ Query analysis employs models to transform or construct optimized search queries

-### Query Re-writing
+### Query re-writing

Retrieval systems should ideally handle a wide spectrum of user inputs, from simple and poorly worded queries to complex, multi-faceted questions.
To achieve this versatility, a popular approach is to use models to transform raw user queries into more effective search queries.
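+As a minimal sketch of query re-writing (hypothetical prompt wording; assuming the `langchain-openai` package and a configured API key):
+
+```python
+from langchain_openai import ChatOpenAI
+
+model = ChatOpenAI(model="gpt-4o", temperature=0)
+
+def rewrite_query(raw_query: str) -> str:
+    # Ask the model to reformulate the user's input as a cleaner search query
+    prompt = (
+        "Rewrite the following user question as a concise, "
+        f"well-formed search query:\n\n{raw_query}"
+    )
+    return model.invoke(prompt).content
+
+print(rewrite_query("umm what was that thing about llms forgetting stuff??"))
+```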
@@ -78,7 +78,7 @@ from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

# Define a Pydantic model to enforce the output structure
class Questions(BaseModel):
    questions: List[str] = Field(
        description="A list of sub-questions related to the input query."
    )

@@ -107,7 +107,7 @@ See our RAG from Scratch videos for a few different specific approaches:

:::

-### Query Construction
+### Query construction

Query analysis can also focus on translating natural language queries into specialized query languages or filters. This translation is crucial for effectively interacting with various types of databases that house structured or semi-structured data.

@@ -149,7 +149,7 @@ retriever = SelfQueryRetriever.from_llm(

:::

-## Information Retrieval
+## Information retrieval

### Common retrieval systems

diff --git a/docs/docs/concepts/retrievers.mdx b/docs/docs/concepts/retrievers.mdx
index d4e0a899fccf5..5aaa893b7fde1 100644
--- a/docs/docs/concepts/retrievers.mdx
+++ b/docs/docs/concepts/retrievers.mdx
@@ -54,13 +54,13 @@ Retrievers return a list of [Document](https://api.python.langchain.com/en/lates

Despite the flexibility of the retriever interface, a few common types of retrieval systems are frequently used.

### Search APIs

It's important to note that retrievers don't need to actually *store* documents. For example, retrievers can be built on top of search APIs that simply return search results! See our retriever integrations with [Amazon Kendra](https://python.langchain.com/docs/integrations/retrievers/amazon_kendra_retriever/) or [Wikipedia Search](https://python.langchain.com/docs/integrations/retrievers/wikipedia/).

-### Relational or Graph Database
+### Relational or graph database

Retrievers can be built on top of relational or graph databases. In these cases, [query analysis](/docs/concepts/retrieval/) techniques that construct a structured query from natural language are critical.

@@ -73,7 +73,7 @@ For example, you can build a retriever for a SQL database using text-to-SQL conv

:::

-### Lexical Search
+### Lexical search

As discussed in our conceptual review of [retrieval](/docs/concepts/retrieval/), many search engines are based upon matching words in a query to the words in each document. [BM25](https://en.wikipedia.org/wiki/Okapi_BM25#:~:text=BM25%20is%20a%20bag%2Dof,slightly%20different%20components%20and%20parameters.) and [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) are [two popular lexical search algorithms](https://cameronrwolfe.substack.com/p/the-basics-of-ai-powered-vector-search?utm_source=profile&utm_medium=reader2).

@@ -106,7 +106,7 @@ This is particularly useful when you have multiple retrievers that are good at f

It is easy to create an [ensemble retriever](/docs/how_to/ensemble_retriever/) that combines multiple retrievers with linear weighted scores:

```python
-# initialize the ensemble retriever
+# Initialize the ensemble retriever
ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, vector_store_retriever], weights=[0.5, 0.5]
)
```

@@ -115,7 +115,7 @@
When ensembling, how do we combine search results from many retrievers?
This motivates the concept of re-ranking, which takes the output of multiple retrievers and combines them using a more sophisticated algorithm such as [Reciprocal Rank Fusion (RRF)](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf).

-### Source Document Retention
+### Source document retention

Many retrievers utilize some kind of index to make documents easily searchable.

The process of indexing can include a transformation step (e.g., vectorstores often use document splitting).

diff --git a/docs/docs/concepts/runnables.mdx b/docs/docs/concepts/runnables.mdx
index 3d18e567a2589..678d38bddf7b6 100644
--- a/docs/docs/concepts/runnables.mdx
+++ b/docs/docs/concepts/runnables.mdx
@@ -1,4 +1,4 @@
-# Runnable Interface
+# Runnable interface

The Runnable interface is foundational for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs](https://langchain-ai.github.io/langgraph/concepts/low_level/#compiling-your-graph) and more.

@@ -10,7 +10,7 @@ This guide covers the main concepts and methods of the Runnable interface, which

* A list of built-in `Runnables` can be found in the [LangChain Core API Reference](https://python.langchain.com/api_reference/core/runnables.html). Many of these Runnables are useful when composing custom "chains" in LangChain using the [LangChain Expression Language (LCEL)](/docs/concepts/lcel).
:::

-## Overview of Runnable Interface
+## Overview of the Runnable interface

The Runnable interface defines a standard set of methods that allow a Runnable component to be:

@@ -23,7 +23,7 @@ The Runnable way defines a standard interface that allows a Runnable component t

Please review the [LCEL Cheatsheet](/docs/how_to/lcel_cheatsheet) for some common patterns that involve the Runnable interface and LCEL expressions.

-### Optimized Parallel Execution (Batch)
+### Optimized parallel execution (batch)

LangChain Runnables offer a built-in `batch` (and `batch_as_completed`) API that allows you to process multiple inputs in parallel.

@@ -46,19 +46,19 @@ The async versions of `abatch` and `abatch_as_completed` these rely on asyncio's
:::

:::tip
When processing a large number of inputs using `batch` or `batch_as_completed`, users may want to control the maximum number of parallel calls. This can be done by setting the `max_concurrency` attribute in the `RunnableConfig` dictionary. See the [RunnableConfig](/docs/concepts/runnables#runnableconfig) for more information.

Chat Models also have a built-in [rate limiter](/docs/concepts/chat_models#rate-limiting) that can be used to control the rate at which requests are made.
:::

-### Asynchronous Support
+### Asynchronous support

Runnables expose an asynchronous API, allowing them to be called using the `await` syntax in Python.
Asynchronous methods can be identified by the "a" prefix (e.g., `ainvoke`, `abatch`, `astream`, `abatch_as_completed`).

Please refer to the [Async Programming with LangChain](/docs/concepts/async) guide for more details.
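+For example, a minimal sketch (using a trivial `RunnableLambda`; every Runnable exposes the same async methods):
+
+```python
+import asyncio
+
+from langchain_core.runnables import RunnableLambda
+
+runnable = RunnableLambda(lambda x: x * 2)
+
+async def main() -> None:
+    # abatch is the asynchronous counterpart of batch
+    results = await runnable.abatch([1, 2, 3])
+    print(results)  # [2, 4, 6]
+
+asyncio.run(main())
+```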
## Streaming APIs

Streaming is critical in making applications based on LLMs feel responsive to end-users.

Runnables expose the following three streaming APIs:

@@ -71,7 +71,7 @@
Please refer to the [Streaming Conceptual Guide](/docs/concepts/streaming) for more details on how to stream in LangChain.

-## Input and Output Types
+## Input and output types

Every `Runnable` is characterized by an input and output type. These input and output types can be any Python object, and are defined by the Runnable itself.

@@ -94,7 +94,7 @@ The **input type** and **output type** vary by component:

Please refer to the individual component documentation for more information on the input and output types and how to use them.

-### Inspecting Schemas
+### Inspecting schemas

:::note
This is an advanced feature that is unnecessary for most users. You should probably
@@ -121,7 +121,7 @@ Please see the [Configurable Runnables](#configurable-runnables) section for mor

| `get_config_jsonschema` | Gives the JSONSchema of the config schema for the Runnable. |

#### with_types

LangChain will automatically try to infer the input and output types of a Runnable based on available information.

@@ -131,7 +131,7 @@ Currently, this inference does not work well for more complex Runnables that are

## RunnableConfig

Any of the methods that are used to execute the runnable (e.g., `invoke`, `batch`, `stream`, `astream_events`) accept a second argument called
`RunnableConfig` ([API Reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html#runnableconfig)). This argument is a dictionary that contains configuration for the Runnable that will be used
at run time during the execution of the runnable.

A `RunnableConfig` can have any of the following properties defined:

@@ -212,7 +212,7 @@ attempting to stream data using `astream_events` and `astream_log` as these meth

rely on proper propagation of [callbacks](/docs/concepts/callbacks) defined inside of `RunnableConfig`.
:::

-### Setting Custom Run Name, Tags, and Metadata
+### Setting custom run name, tags, and metadata

The `run_name`, `tags`, and `metadata` attributes of the `RunnableConfig` dictionary can be used to set custom values for the run name, tags, and metadata for a given Runnable.

@@ -229,7 +229,7 @@ The attributes will also be propagated to [callbacks](/docs/concepts/callbacks),

* [How-to trace with LangChain](https://docs.smith.langchain.com/how_to_guides/tracing/trace_with_langchain)
:::

-### Setting Run ID
+### Setting run ID

:::note
This is an advanced feature that is unnecessary for most users.
@@ -255,10 +255,10 @@ some_runnable.invoke(
    }
)

-# do something with the run_id
+# Do something with the run_id
```

-### Setting Recursion Limit
+### Setting recursion limit

:::note
This is an advanced feature that is unnecessary for most users.
:::

Some Runnables may return other Runnables, which can lead to infinite recursion if not handled properly. To prevent this, you can set a `recursion_limit` in the `RunnableConfig` dictionary. This will limit the number of times a Runnable can recurse.
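+As a minimal sketch of passing these options at run time (using a trivial `RunnableLambda` for illustration):
+
+```python
+from langchain_core.runnables import RunnableLambda
+
+runnable = RunnableLambda(lambda x: x + 1)
+
+# The second argument to invoke is the RunnableConfig dictionary
+runnable.invoke(1, config={"recursion_limit": 10, "run_name": "increment"})
+```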
-### Setting Max Concurrency
+### Setting max concurrency

If using the `batch` or `batch_as_completed` methods, you can set the `max_concurrency` attribute in the `RunnableConfig` dictionary to control the maximum number of parallel calls to make. This can be useful when you want to limit the number of parallel calls to prevent overloading a server or API.

@@ -290,7 +290,7 @@ a `session_id` / `conversation_id` to keep track of conversation history.

In addition, you can use it to specify any custom configuration options to pass to any [Configurable Runnable](#configurable-runnables) that they create.

-### Setting Callbacks
+### Setting callbacks

Use this option to configure [callbacks](/docs/concepts/callbacks) for the runnable at runtime. The callbacks will be passed to all sub-calls made by the runnable.

@@ -312,10 +312,10 @@ Please read the [Callbacks Conceptual Guide](/docs/concepts/callbacks) for more

:::important
If you're using Python 3.9 or 3.10 in an async environment, you must propagate the `RunnableConfig` manually to sub-calls in some cases. Please see the
[Propagating RunnableConfig](#propagation-of-runnableconfig) section for more information.
:::

-## Creating a Runnable from a function
+## Creating a runnable from a function

You may need to create a custom Runnable that runs arbitrary logic. This is especially useful if using [LangChain Expression Language (LCEL)](/docs/concepts/lcel) to compose
@@ -333,7 +333,7 @@ Users should not try to subclass Runnables to create a new custom Runnable. It i

much more complex and error-prone than simply using `RunnableLambda` or `RunnableGenerator`.
:::

-## Configurable Runnables
+## Configurable runnables

:::note
This is an advanced feature that is unnecessary for most users.

diff --git a/docs/docs/concepts/structured_outputs.mdx b/docs/docs/concepts/structured_outputs.mdx
index a5d1f9dbbd98a..f58150d5c609d 100644
--- a/docs/docs/concepts/structured_outputs.mdx
+++ b/docs/docs/concepts/structured_outputs.mdx
@@ -1,4 +1,4 @@
-# Structured Outputs
+# Structured outputs

## Overview

@@ -9,7 +9,7 @@ This need motivates the concept of structured output, where models can be instru

![Structured output](/img/structured_output.png)

-## Key Concepts
+## Key concepts

**(1) Schema definition:** The output structure is represented as a schema, which can be defined in several ways.
**(2) Returning structured output:** The model is given this schema, and is instructed to return output that conforms to it.

@@ -62,7 +62,7 @@ With a schema defined, we need a way to instruct the model to use it.

While one approach is to include this schema in the prompt and *ask nicely* for the model to use it, this is not recommended.

Several more powerful methods that utilize native features in the model provider's API are available.

-### Using Tool Calling
+### Using tool calling

Many [model providers support](/docs/integrations/chat/) tool calling, a concept discussed in more detail in our [tool calling guide](/docs/concepts/tool_calling/).

In short, tool calling involves binding a tool to a model and, when appropriate, the model can *decide* to call this tool and ensure its response conforms to the tool's schema.
@@ -72,7 +72,7 @@ Here is an example using the `ResponseFormatter` schema defined above:

```python
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o", temperature=0)
-# Bind ResponseFormatter schema as a tool to the model
+# Bind ResponseFormatter schema as a tool to the model
model_with_tools = model.bind_tools([ResponseFormatter])
# Invoke the model
ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?")

@@ -86,7 +86,7 @@ This dictionary can be optionally parsed into a Pydantic object, matching our or

ai_msg.tool_calls[0]["args"]
{'answer': "The powerhouse of the cell is the mitochondrion. Mitochondria are organelles that generate most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.", 'followup_question': 'What is the function of ATP in the cell?'}

-# Parse the dictionary into a Pydantic object
+# Parse the dictionary into a Pydantic object
pydantic_object = ResponseFormatter.model_validate(ai_msg.tool_calls[0]["args"])
```

@@ -136,7 +136,7 @@ This both binds the schema to the model as a tool and parses the output to the s

model_with_structure = model.with_structured_output(ResponseFormatter)
# Invoke the model
structured_output = model_with_structure.invoke("What is the powerhouse of the cell?")
-# Get back the Pydantic object
+# Get back the Pydantic object
structured_output
ResponseFormatter(answer="The powerhouse of the cell is the mitochondrion. Mitochondria are organelles that generate most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.", followup_question='What is the function of ATP in the cell?')
```

diff --git a/docs/docs/concepts/text_splitters.mdx b/docs/docs/concepts/text_splitters.mdx
index cd72ecfc85280..c5575a219f513 100644
--- a/docs/docs/concepts/text_splitters.mdx
+++ b/docs/docs/concepts/text_splitters.mdx
@@ -1,4 +1,4 @@
-# Text Splitters
+# Text splitters

:::info[Prerequisites]

diff --git a/docs/docs/concepts/tool_calling.mdx b/docs/docs/concepts/tool_calling.mdx
index 674d78d2ee31a..e377688334640 100644
--- a/docs/docs/concepts/tool_calling.mdx
+++ b/docs/docs/concepts/tool_calling.mdx
@@ -1,4 +1,4 @@
-# Tool Calling
+# Tool calling

:::info[Prerequisites]
* [Tools](/docs/concepts/tools)

@@ -19,7 +19,7 @@ You will sometimes hear the term `function calling`. We use this term interchang

![Conceptual overview of tool calling](/img/tool_calling_concept.png)

-## Key Concepts
+## Key concepts

**(1) Tool Creation:** Use the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator to create a [tool](/docs/concepts/tools). A tool is an association between a function and its schema.
**(2) Tool Binding:** The tool needs to be connected to a model that supports tool calling. This gives the model awareness of the tool and the associated input schema required by the tool.

@@ -44,7 +44,7 @@ model_with_tools = model.bind_tools(tools)

response = model_with_tools.invoke(user_input)
```

-## Tool Creation
+## Tool creation

The recommended way to create a tool is using the `@tool` decorator.

@@ -65,7 +65,7 @@ def multiply(a: int, b: int) -> int:

:::

-## Tool Binding
+## Tool binding

[Many](https://platform.openai.com/docs/guides/function-calling) [model providers](https://platform.openai.com/docs/guides/function-calling) support tool calling.
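A note for readers of the structured-outputs hunks above: the `ResponseFormatter` schema is defined earlier on that page, outside this diff. A minimal Pydantic definition consistent with the fields used in these examples would look like the following (a reconstruction, not necessarily the page's exact code):

```python
from pydantic import BaseModel, Field

class ResponseFormatter(BaseModel):
    """Always use this tool to structure your response to the user."""

    answer: str = Field(description="The answer to the user's question")
    followup_question: str = Field(description="A followup question the user could ask")
```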
@@ -95,7 +95,7 @@ def multiply(a: int, b: int) -> int:

llm_with_tools = tool_calling_model.bind_tools([multiply])
```

-## Tool Calling
+## Tool calling

![Diagram of a tool call by a model](/img/tool_call_example.png)

diff --git a/docs/docs/concepts/tools.mdx b/docs/docs/concepts/tools.mdx
index 5cf655a8a6626..fe5910cdb2237 100644
--- a/docs/docs/concepts/tools.mdx
+++ b/docs/docs/concepts/tools.mdx
@@ -10,7 +10,7 @@ The **tool** abstraction in LangChain associates a python **function** with a **

**Tools** can be passed to [chat models](/docs/concepts/chat_models) that support [tool calling](/docs/concepts/tool_calling) allowing the model to request the execution of a specific function with specific inputs.

-## Key Concepts
+## Key concepts

- Tools are a way to encapsulate a function and its schema in a way that can be passed to a chat model.
- Create tools using the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator, which simplifies the process of tool creation, supporting the following:
@@ -18,7 +18,7 @@ The **tool** abstraction in LangChain associates a python **function** with a **
  - Defining tools that return **artifacts** (e.g. images, dataframes, etc.)
  - Hiding input arguments from the schema (and hence from the model) using **injected tool arguments**.

-## Tool Interface
+## Tool interface

The tool interface is defined in the [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.base.BaseTool.html#langchain_core.tools.base.BaseTool) class which is a subclass of the [Runnable Interface](/docs/concepts/runnables).

@@ -70,9 +70,9 @@ print(multiply.name) # multiply

print(multiply.description) # Multiply two numbers.
print(multiply.args)
# {
-#    'type': 'object',
-#    'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}},
-#    'required': ['a', 'b']
+#    'type': 'object',
+#    'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}},
+#    'required': ['a', 'b']
# }
```

@@ -141,7 +141,7 @@ See [how to pass run time values to tools](https://python.langchain.com/docs/how

You can use the `RunnableConfig` object to pass custom run time values to tools.

-If you need to access the [RunnableConfig](/docs/concepts/runnables/#runnableconfig) object from within a tool. This can be done by using the `RunnableConfig` annotation in the tool's function signature.
+If you need to access the [RunnableConfig](/docs/concepts/runnables/#runnableconfig) object from within a tool, you can do so by using the `RunnableConfig` annotation in the tool's function signature.

```python
from langchain_core.runnables import RunnableConfig

@@ -160,7 +160,7 @@ The `config` will not be part of the tool's schema and will be injected at runti

:::note
You may need to access the `config` object from within the tool so that you can manually propagate it to sub-calls. This is necessary if you're working with Python 3.9 / 3.10 in an [async](/docs/concepts/async) environment.

-Please read [Propagation RunnableConfig](/docs/concepts/runnables#propagation-runnableconfig) for more details to learn how to propagate the `RunnableConfig` down the call chain manually (or upgrade to Python 3.11 where this is no longer an issue).
+Please read [Propagation of RunnableConfig](/docs/concepts/runnables#propagation-of-runnableconfig) to learn how to propagate the `RunnableConfig` down the call chain manually (or upgrade to Python 3.11, where this is no longer an issue).
:::

### InjectedState

@@ -198,7 +198,7 @@ toolkit = ExampleTookit(...)

tools = toolkit.get_tools()
```

-## Related Resources
+## Related resources

See the following resources for more information:

diff --git a/docs/docs/concepts/vectorstores.mdx b/docs/docs/concepts/vectorstores.mdx
index 23ed6623c1a5d..29a0786ef09f9 100644
--- a/docs/docs/concepts/vectorstores.mdx
+++ b/docs/docs/concepts/vectorstores.mdx
@@ -37,7 +37,7 @@ from pinecone import Pinecone

from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

-# Initialize Pinecone
+# Initialize Pinecone
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Initialize with an embedding model
@@ -117,7 +117,7 @@ pc.create_index(

:::

-### Similarity Search
+### Similarity search

Given a similarity metric to measure the distance between the embedded query and any embedded document, we need an algorithm to efficiently search over *all* the embedded documents to find the most similar ones. There are various ways to do this. As an example, many vectorstores implement [HNSW (Hierarchical Navigable Small World)](https://www.pinecone.io/learn/series/faiss/hnsw/), a graph-based index structure that allows for efficient similarity search.

diff --git a/docs/docs/concepts/why_langchain.mdx b/docs/docs/concepts/why_langchain.mdx
index d17c48f6460f7..1eae06eea3705 100644
--- a/docs/docs/concepts/why_langchain.mdx
+++ b/docs/docs/concepts/why_langchain.mdx
@@ -1,4 +1,4 @@
-# Why LangChain?
+# Why LangChain?

The goal of `langchain` the Python package and LangChain the company is to make it as easy as possible for developers to build applications that reason. While LangChain originally started as a single open source package, it has evolved into a company and a whole ecosystem.

@@ -29,7 +29,7 @@ As an example, all [chat models](/docs/concepts/chat_models/) implement the [Bas

This provides a standard way to interact with chat models, supporting important but often provider-specific features like [tool calling](/docs/concepts/tool_calling/) and [structured outputs](/docs/concepts/structured_outputs/).

-### Example: Chat models
+### Example: chat models

Many [model providers](/docs/concepts/chat_models/) support [tool calling](/docs/concepts/tool_calling/), a critical feature for many applications (e.g., [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/)) that allows a developer to request model responses that match a particular schema. The APIs for each provider differ.

@@ -53,7 +53,7 @@ schema = ...

model_with_structure = model.with_structured_output(schema)
```

-### Example: Retrievers
+### Example: retrievers

In the context of [RAG](/docs/concepts/rag/) and LLM application components, LangChain's [retriever](/docs/concepts/retrievers/) interface provides a standard way to connect to many different types of data services or databases (e.g., [vector stores](/docs/concepts/vectorstores) or databases). The underlying implementation of the retriever depends on the type of data store or database you are connecting to, but all retrievers implement the [runnable interface](/docs/concepts/runnables/), meaning they can be invoked in a common manner.
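Rounding out the tools.mdx hunks, here is a minimal, self-contained sketch of a tool that receives the `RunnableConfig` at runtime; the tool itself and the `session_id` key are hypothetical:

```python
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool

@tool
def greet(name: str, config: RunnableConfig) -> str:
    """Greet a user, tagging the greeting with the current session."""
    # `config` is excluded from the tool's schema and injected at runtime
    session_id = config.get("configurable", {}).get("session_id", "unknown")
    return f"Hello {name}! (session: {session_id})"

print(greet.invoke({"name": "Ada"}, config={"configurable": {"session_id": "abc123"}}))
# Hello Ada! (session: abc123)
```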
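Finally, since the vector store and retriever hunks describe `similarity_search` and the common retriever interface without showing an end-to-end call, a minimal sketch — using the in-memory vector store and made-up documents; any embeddings integration would do in place of `OpenAIEmbeddings`:

```python
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

vector_store = InMemoryVectorStore(embedding=OpenAIEmbeddings())
vector_store.add_documents([
    Document(page_content="Mitochondria are the powerhouse of the cell."),
    Document(page_content="The cell wall gives plant cells their structure."),
])

# Query the vector store directly...
docs = vector_store.similarity_search("What powers the cell?", k=1)

# ...or through the standard retriever interface, which is a Runnable
retriever = vector_store.as_retriever(search_kwargs={"k": 1})
docs = retriever.invoke("What powers the cell?")
```

Because the retriever is a Runnable, it supports the same `invoke` / `batch` / `stream` methods and the same `RunnableConfig` options discussed in the runnables hunks above.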