diff --git a/docs/core_docs/docs/integrations/chat/ollama.ipynb b/docs/core_docs/docs/integrations/chat/ollama.ipynb new file mode 100644 index 000000000000..1ed84b3130f8 --- /dev/null +++ b/docs/core_docs/docs/integrations/chat/ollama.ipynb @@ -0,0 +1,559 @@ +{ + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": { + "vscode": { + "languageId": "raw" + } + }, + "source": [ + "---\n", + "sidebar_label: Ollama\n", + "---" + ] + }, + { + "cell_type": "markdown", + "id": "e49f1e0d", + "metadata": {}, + "source": [ + "# ChatOllama\n", + "\n", + "This will help you get started with `ChatOllama` [chat models](/docs/concepts/#chat-models). For detailed documentation of all `ChatOllama` features and configurations, head to the [API reference](https://api.js.langchain.com/classes/langchain_ollama.ChatOllama.html).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as Llama 2, locally.\n", + "\n", + "Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile. It optimizes setup and configuration details, including GPU usage.\n", + "\n", + "This example goes over how to use LangChain to interact with an Ollama-run Llama 2 7b instance as a chat model.\n", + "For a complete list of supported models and model variants, see the [Ollama model library](https://github.com/jmorganca/ollama#model-library).\n", + "\n", + "| Class | Package | Local | Serializable | [PY support](https://python.langchain.com/v0.2/docs/integrations/chat/ollama) | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatOllama](https://api.js.langchain.com/classes/langchain_ollama.ChatOllama.html) | [@langchain/ollama](https://api.js.langchain.com/modules/langchain_ollama.html) | ✅ | beta | ✅ | ![NPM - Downloads](https://img.shields.io/npm/dm/@langchain/ollama?style=flat-square&label=%20&) | ![NPM - Version](https://img.shields.io/npm/v/@langchain/ollama?style=flat-square&label=%20&) |\n", + "\n", + "### Model features\n", + "| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", + "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | \n", + "\n", + "## Setup\n", + "\n", + "Follow [these instructions](https://github.com/jmorganca/ollama) to set up and run a local Ollama instance.
Then, download the `@langchain/ollama` package.\n", + "\n", + "### Credentials\n", + "\n", + "If you want to get automated tracing of your model calls, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting the lines below:\n", + "\n", + "```bash\n", + "# export LANGCHAIN_TRACING_V2=\"true\"\n", + "# export LANGCHAIN_API_KEY=\"your-api-key\"\n", + "```\n", + "\n", + "### Installation\n", + "\n", + "The LangChain ChatOllama integration lives in the `@langchain/ollama` package:\n", + "\n", + "```{=mdx}\n", + "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n", + "import Npm2Yarn from \"@theme/Npm2Yarn\";\n", + "\n", + "<IntegrationInstallTooltip></IntegrationInstallTooltip>\n", + "\n", + "<Npm2Yarn>\n", + " @langchain/ollama\n", + "</Npm2Yarn>\n", + "\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "a38cde65-254d-4219-a441-068766c0d4b5", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate chat completions:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae", + "metadata": {}, + "outputs": [], + "source": [ + "import { ChatOllama } from \"@langchain/ollama\"\n", + "\n", + "const llm = new ChatOllama({\n", + " model: \"llama3\",\n", + " temperature: 0,\n", + " maxRetries: 2,\n", + " // other params...\n", + "})" + ] + }, + { + "cell_type": "markdown", + "id": "2b4f3e15", + "metadata": {}, + "source": [ + "## Invocation" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "62e0dbc3", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "AIMessage {\n", + " \"content\": \"Je adore le programmation.\\n\\n(Note: \\\"programmation\\\" is the feminine form of the noun in French, but if you want to use the masculine form, it would be \\\"le programme\\\" instead.)\",\n", + " \"additional_kwargs\": {},\n", + " \"response_metadata\": {\n", + " \"model\": \"llama3\",\n", + " \"created_at\": \"2024-08-01T16:59:17.359302Z\",\n", + " \"done_reason\": \"stop\",\n", + " \"done\": true,\n", + " \"total_duration\": 6399311167,\n", + " \"load_duration\": 5575776417,\n", + " \"prompt_eval_count\": 35,\n", + " \"prompt_eval_duration\": 110053000,\n", + " \"eval_count\": 43,\n", + " \"eval_duration\": 711744000\n", + " },\n", + " \"tool_calls\": [],\n", + " \"invalid_tool_calls\": [],\n", + " \"usage_metadata\": {\n", + " \"input_tokens\": 35,\n", + " \"output_tokens\": 43,\n", + " \"total_tokens\": 78\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "const aiMsg = await llm.invoke([\n", + " [\n", + " \"system\",\n", + " \"You are a helpful assistant that translates English to French.
Translate the user sentence.\",\n", + " ],\n", + " [\"human\", \"I love programming.\"],\n", + "])\n", + "aiMsg" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "d86145b3-bfef-46e8-b227-4dda5c9c2705", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Je adore le programmation.\n", + "\n", + "(Note: \"programmation\" is the feminine form of the noun in French, but if you want to use the masculine form, it would be \"le programme\" instead.)\n" + ] + } + ], + "source": [ + "console.log(aiMsg.content)" + ] + }, + { + "cell_type": "markdown", + "id": "18e2bfc0-7e78-4528-a73f-499ac150dca8", + "metadata": {}, + "source": [ + "## Chaining\n", + "\n", + "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "AIMessage {\n", + " \"content\": \"Ich liebe Programmieren!\\n\\n(Note: \\\"Ich liebe\\\" means \\\"I love\\\", \\\"Programmieren\\\" is the verb for \\\"programming\\\")\",\n", + " \"additional_kwargs\": {},\n", + " \"response_metadata\": {\n", + " \"model\": \"llama3\",\n", + " \"created_at\": \"2024-08-01T16:59:18.088423Z\",\n", + " \"done_reason\": \"stop\",\n", + " \"done\": true,\n", + " \"total_duration\": 585146125,\n", + " \"load_duration\": 27557166,\n", + " \"prompt_eval_count\": 30,\n", + " \"prompt_eval_duration\": 74241000,\n", + " \"eval_count\": 29,\n", + " \"eval_duration\": 481195000\n", + " },\n", + " \"tool_calls\": [],\n", + " \"invalid_tool_calls\": [],\n", + " \"usage_metadata\": {\n", + " \"input_tokens\": 30,\n", + " \"output_tokens\": 29,\n", + " \"total_tokens\": 59\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "import { ChatPromptTemplate } from \"@langchain/core/prompts\"\n", + "\n", + "const prompt = ChatPromptTemplate.fromMessages(\n", + " [\n", + " [\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ],\n", + " [\"human\", \"{input}\"],\n", + " ]\n", + ")\n", + "\n", + "const chain = prompt.pipe(llm);\n", + "await chain.invoke(\n", + " {\n", + " input_language: \"English\",\n", + " output_language: \"German\",\n", + " input: \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "d1ee55bc-ffc8-4cfa-801c-993953a08cfd", + "metadata": {}, + "source": [ + "## Tools\n", + "\n", + "Ollama now offers support for native tool calling. The example below demonstrates how you can invoke a tool from an Ollama model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "d2502c0d", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "AIMessage {\n", + " \"content\": \"\",\n", + " \"additional_kwargs\": {},\n", + " \"response_metadata\": {\n", + " \"model\": \"llama3-groq-tool-use\",\n", + " \"created_at\": \"2024-08-01T18:43:13.2181Z\",\n", + " \"done_reason\": \"stop\",\n", + " \"done\": true,\n", + " \"total_duration\": 2311023875,\n", + " \"load_duration\": 1560670292,\n", + " \"prompt_eval_count\": 177,\n", + " \"prompt_eval_duration\": 263603000,\n", + " \"eval_count\": 30,\n", + " \"eval_duration\": 485582000\n", + " },\n", + " \"tool_calls\": [\n", + " {\n", + " \"name\": \"get_current_weather\",\n", + " \"args\": {\n", + " \"location\": \"San Francisco, CA\"\n", + " },\n", + " \"id\": \"c7a9d590-99ad-42af-9996-41b90efcf827\",\n", + " \"type\": \"tool_call\"\n", + " }\n", + " ],\n", + " \"invalid_tool_calls\": [],\n", + " \"usage_metadata\": {\n", + " \"input_tokens\": 177,\n", + " \"output_tokens\": 30,\n", + " \"total_tokens\": 207\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "import { tool } from \"@langchain/core/tools\";\n", + "import { ChatOllama } from \"@langchain/ollama\";\n", + "import { z } from \"zod\";\n", + "\n", + "const weatherTool = tool((_) => \"Da weather is weatherin\", {\n", + " name: \"get_current_weather\",\n", + " description: \"Get the current weather in a given location\",\n", + " schema: z.object({\n", + " location: z.string().describe(\"The city and state, e.g. San Francisco, CA\"),\n", + " }),\n", + "});\n", + "\n", + "// Define the model\n", + "const llmForTool = new ChatOllama({\n", + " model: \"llama3-groq-tool-use\",\n", + "});\n", + "\n", + "// Bind the tool to the model\n", + "const llmWithTools = llmForTool.bindTools([weatherTool]);\n", + "\n", + "const resultFromTool = await llmWithTools.invoke(\n", + " \"What's the weather like today in San Francisco? Ensure you use the 'get_current_weather' tool.\"\n", + ");\n", + "\n", + "console.log(resultFromTool);" + ] + }, + { + "cell_type": "markdown", + "id": "47faa093", + "metadata": {}, + "source": [ + "### `.withStructuredOutput`\n", + "\n", + "Since `ChatOllama` supports the `.bindTools()` method, you can also call `.withStructuredOutput()` to get a structured output from the tool." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "759924f6", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{ location: 'San Francisco, CA' }\n" + ] + } + ], + "source": [ + "import { ChatOllama } from \"@langchain/ollama\";\n", + "import { z } from \"zod\";\n", + "\n", + "// Define the model\n", + "const llmForWSO = new ChatOllama({\n", + " model: \"llama3-groq-tool-use\",\n", + "});\n", + "\n", + "// Define the tool schema you'd like the model to use.\n", + "const schemaForWSO = z.object({\n", + " location: z.string().describe(\"The city and state, e.g. San Francisco, CA\"),\n", + "});\n", + "\n", + "// Pass the schema to the withStructuredOutput method to bind it to the model.\n", + "const llmWithStructuredOutput = llmForWSO.withStructuredOutput(schemaForWSO, {\n", + " name: \"get_current_weather\",\n", + "});\n", + "\n", + "const resultFromWSO = await llmWithStructuredOutput.invoke(\n", + " \"What's the weather like today in San Francisco? 
Ensure you use the 'get_current_weather' tool.\"\n", + ");\n", + "console.log(resultFromWSO);" + ] + }, + { + "cell_type": "markdown", + "id": "cb1377af", + "metadata": {}, + "source": [ + "### JSON mode\n", + "\n", + "Ollama also supports a JSON mode that coerces model outputs to only return JSON. Here's an example of how this can be useful for extraction:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "de94282b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "AIMessage {\n", + " \"content\": \"{\\n\\\"original\\\": \\\"I love programming\\\",\\n\\\"translated\\\": \\\"Ich liebe Programmierung\\\"\\n}\",\n", + " \"additional_kwargs\": {},\n", + " \"response_metadata\": {\n", + " \"model\": \"llama3\",\n", + " \"created_at\": \"2024-08-01T17:24:54.35568Z\",\n", + " \"done_reason\": \"stop\",\n", + " \"done\": true,\n", + " \"total_duration\": 1754811583,\n", + " \"load_duration\": 1297200208,\n", + " \"prompt_eval_count\": 47,\n", + " \"prompt_eval_duration\": 128532000,\n", + " \"eval_count\": 20,\n", + " \"eval_duration\": 318519000\n", + " },\n", + " \"tool_calls\": [],\n", + " \"invalid_tool_calls\": [],\n", + " \"usage_metadata\": {\n", + " \"input_tokens\": 47,\n", + " \"output_tokens\": 20,\n", + " \"total_tokens\": 67\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "import { ChatOllama } from \"@langchain/ollama\";\n", + "import { ChatPromptTemplate } from \"@langchain/core/prompts\";\n", + "\n", + "const promptForJsonMode = ChatPromptTemplate.fromMessages([\n", + " [\n", + " \"system\",\n", + " `You are an expert translator. Format all responses as JSON objects with two keys: \"original\" and \"translated\".`,\n", + " ],\n", + " [\"human\", `Translate \"{input}\" into {language}.`],\n", + "]);\n", + "\n", + "const llmJsonMode = new ChatOllama({\n", + " baseUrl: \"http://localhost:11434\", // Default value\n", + " model: \"llama3\",\n", + " format: \"json\",\n", + "});\n", + "\n", + "const chainForJsonMode = promptForJsonMode.pipe(llmJsonMode);\n", + "\n", + "const resultFromJsonMode = await chainForJsonMode.invoke({\n", + " input: \"I love programming\",\n", + " language: \"German\",\n", + "});\n", + "\n", + "console.log(resultFromJsonMode);" + ] + }, + { + "cell_type": "markdown", + "id": "9881d422", + "metadata": {}, + "source": [ + "## Multimodal models\n", + "\n", + "Ollama supports open source multimodal models like [LLaVA](https://ollama.ai/library/llava) in versions 0.1.15 and up.\n", + "You can pass images as part of a message's `content` field to multimodal-capable models like this:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "958171d7", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "AIMessage {\n", + " \"content\": \" The image shows a hot dog in a bun, which appears to be a footlong. It has been cooked or grilled to the point where it's browned and possibly has some blackened edges, indicating it might be slightly overcooked. Accompanying the hot dog is a bun that looks toasted as well. There are visible char marks on both the hot dog and the bun, suggesting they have been cooked directly over a source of heat, such as a grill or broiler. The background is white, which puts the focus entirely on the hot dog and its bun. 
\",\n", + " \"additional_kwargs\": {},\n", + " \"response_metadata\": {\n", + " \"model\": \"llava\",\n", + " \"created_at\": \"2024-08-01T17:25:02.169957Z\",\n", + " \"done_reason\": \"stop\",\n", + " \"done\": true,\n", + " \"total_duration\": 5700249458,\n", + " \"load_duration\": 2543040666,\n", + " \"prompt_eval_count\": 1,\n", + " \"prompt_eval_duration\": 1032591000,\n", + " \"eval_count\": 127,\n", + " \"eval_duration\": 2114201000\n", + " },\n", + " \"tool_calls\": [],\n", + " \"invalid_tool_calls\": [],\n", + " \"usage_metadata\": {\n", + " \"input_tokens\": 1,\n", + " \"output_tokens\": 127,\n", + " \"total_tokens\": 128\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "import { ChatOllama } from \"@langchain/ollama\";\n", + "import { HumanMessage } from \"@langchain/core/messages\";\n", + "import * as fs from \"node:fs/promises\";\n", + "\n", + "const imageData = await fs.readFile(\"../../../../../examples/hotdog.jpg\");\n", + "const llmForMultiModal = new ChatOllama({\n", + " model: \"llava\",\n", + " baseUrl: \"http://127.0.0.1:11434\",\n", + "});\n", + "const multiModalRes = await llmForMultiModal.invoke([\n", + " new HumanMessage({\n", + " content: [\n", + " {\n", + " type: \"text\",\n", + " text: \"What is in this image?\",\n", + " },\n", + " {\n", + " type: \"image_url\",\n", + " image_url: `data:image/jpeg;base64,${imageData.toString(\"base64\")}`,\n", + " },\n", + " ],\n", + " }),\n", + "]);\n", + "console.log(multiModalRes);" + ] + }, + { + "cell_type": "markdown", + "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all ChatOllama features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_ollama.ChatOllama.html" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "TypeScript", + "language": "typescript", + "name": "tslab" + }, + "language_info": { + "codemirror_mode": { + "mode": "typescript", + "name": "javascript", + "typescript": true + }, + "file_extension": ".ts", + "mimetype": "text/typescript", + "name": "typescript", + "version": "3.7.2" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/core_docs/docs/integrations/chat/ollama.mdx b/docs/core_docs/docs/integrations/chat/ollama.mdx deleted file mode 100644 index d7c50b190630..000000000000 --- a/docs/core_docs/docs/integrations/chat/ollama.mdx +++ /dev/null @@ -1,81 +0,0 @@ ---- -sidebar_label: Ollama ---- - -# ChatOllama - -[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as Llama 2, locally. - -Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile. It optimizes setup and configuration details, including GPU usage. - -This example goes over how to use LangChain to interact with an Ollama-run Llama 2 7b instance as a chat model. -For a complete list of supported models and model variants, see the [Ollama model library](https://github.com/jmorganca/ollama#model-library). - -## Setup - -Follow [these instructions](https://github.com/jmorganca/ollama) to set up and run a local Ollama instance. Then, download the `@langchain/ollama` package. 
- -import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx"; - -<IntegrationInstallTooltip></IntegrationInstallTooltip> - -```bash npm2yarn -npm install @langchain/ollama -``` - -## Usage - -import CodeBlock from "@theme/CodeBlock"; -import OllamaExample from "@examples/models/chat/integration_ollama.ts"; - -<CodeBlock language="typescript">{OllamaExample}</CodeBlock> - -:::tip -See a LangSmith trace of the above example [here](https://smith.langchain.com/public/c4f3cb1a-4496-40bd-854c-898118b21809/r) -::: - -## Tools - -Ollama now offers support for native tool calling. The example below demonstrates how you can invoke a tool from an Ollama model. - -import OllamaToolsExample from "@examples/models/chat/integration_ollama_tools.ts"; - -<CodeBlock language="typescript">{OllamaToolsExample}</CodeBlock> - -:::tip -You can see the LangSmith trace of the above example [here](https://smith.langchain.com/public/940f4279-6825-4d19-9653-4c50d3c70625/r) -::: - -Since `ChatOllama` supports the `.bindTools()` method, you can also call `.withStructuredOutput()` to get a structured output from the tool. - -import OllamaWSOExample from "@examples/models/chat/integration_ollama_wso.ts"; - -<CodeBlock language="typescript">{OllamaWSOExample}</CodeBlock> - -:::tip -You can see the LangSmith trace of the above example [here](https://smith.langchain.com/public/ed113c53-1299-4814-817e-1157c9eac47e/r) -::: - -## JSON mode - -Ollama also supports a JSON mode that coerces model outputs to only return JSON. Here's an example of how this can be useful for extraction: - -import OllamaJSONModeExample from "@examples/models/chat/integration_ollama_json_mode.ts"; - -<CodeBlock language="typescript">{OllamaJSONModeExample}</CodeBlock> - -:::tip -You can see a simple LangSmith trace of this [here](https://smith.langchain.com/public/54c0430c-0d59-4121-b1ca-34ac1e95e0bb/r) -::: - -## Multimodal models - -Ollama supports open source multimodal models like [LLaVA](https://ollama.ai/library/llava) in versions 0.1.15 and up. -You can pass images as part of a message's `content` field to multimodal-capable models like this: - -import OllamaMultimodalExample from "@examples/models/chat/integration_ollama_multimodal.ts"; - -<CodeBlock language="typescript">{OllamaMultimodalExample}</CodeBlock> - -This will currently not use the image's position within the prompt message as additional information, and will just pass -the image along as context with the rest of the prompt messages.