From 47142eb6ee3ce29dc52c21601a0d118475687b94 Mon Sep 17 00:00:00 2001 From: Rashmi Pawar <168514198+raspawar@users.noreply.github.com> Date: Fri, 4 Oct 2024 05:02:45 +0530 Subject: [PATCH] docs: Integrations NVIDIA llm documentation (#26934) **Description:** Add Notebook for NVIDIA prompt completion llm class. cc: @sumitkbh @mattf @dglogo --------- Co-authored-by: Erick Friis --- .../llms/nvidia_ai_endpoints.ipynb | 309 ++++++++++++++++++ 1 file changed, 309 insertions(+) create mode 100644 docs/docs/integrations/llms/nvidia_ai_endpoints.ipynb diff --git a/docs/docs/integrations/llms/nvidia_ai_endpoints.ipynb b/docs/docs/integrations/llms/nvidia_ai_endpoints.ipynb new file mode 100644 index 0000000000000..190cf8db74d57 --- /dev/null +++ b/docs/docs/integrations/llms/nvidia_ai_endpoints.ipynb @@ -0,0 +1,309 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# NVIDIA\n", + "\n", + "This will help you get started with NVIDIA [models](/docs/concepts/#llms). For detailed documentation of all `NVIDIA` features and configurations head to the [API reference](https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.llms.NVIDIA.html).\n", + "\n", + "## Overview\n", + "The `langchain-nvidia-ai-endpoints` package contains LangChain integrations for building applications with models on \n", + "the NVIDIA NIM inference microservice. These models are optimized by NVIDIA to deliver the best performance on NVIDIA \n", + "accelerated infrastructure and deployed as a NIM, an easy-to-use, prebuilt container that deploys anywhere using a single \n", + "command on NVIDIA accelerated infrastructure.\n", + "\n", + "NVIDIA-hosted deployments of NIMs are available to test on the [NVIDIA API catalog](https://build.nvidia.com/). After testing, \n", + "NIMs can be exported from NVIDIA’s API catalog using the NVIDIA AI Enterprise license and run on-premises or in the cloud, \n", + "giving enterprises ownership and full control of their IP and AI application.\n", + "\n", + "NIMs are packaged as container images on a per-model basis and are distributed as NGC container images through the NVIDIA NGC Catalog.\n",
\n", + "At their core, NIMs provide easy, consistent, and familiar APIs for running inference on an AI model.\n", + "\n", + "This example goes over how to use LangChain to interact with NVIDIA supported via the `NVIDIA` class.\n", + "\n", + "For more information on accessing the llm models through this api, check out the [NVIDIA](https://python.langchain.com/docs/integrations/llms/nvidia_ai_endpoints/) documentation.\n", + "\n", + "### Integration details\n", + "\n", + "| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [NVIDIA](https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.chat_models.ChatNVIDIA.html) | [langchain_nvidia_ai_endpoints](https://python.langchain.com/api_reference/nvidia_ai_endpoints/index.html) | ✅ | beta | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_nvidia_ai_endpoints?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_nvidia_ai_endpoints?style=flat-square&label=%20) |\n", + "\n", + "### Model features\n", + "| JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", + "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | \n", + "\n", + "## Setup\n", + "\n", + "**To get started:**\n", + "\n", + "1. Create a free account with [NVIDIA](https://build.nvidia.com/), which hosts NVIDIA AI Foundation models.\n", + "\n", + "2. Click on your model of choice.\n", + "\n", + "3. Under `Input` select the `Python` tab, and click `Get API Key`. Then click `Generate Key`.\n", + "\n", + "4. Copy and save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints.\n", + "\n", + "### Credentials\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"NVIDIA_API_KEY\"):\n", + " # Note: the API key should start with \"nvapi-\"\n", + " os.environ[\"NVIDIA_API_KEY\"] = getpass.getpass(\"Enter your NVIDIA API key: \")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain NVIDIA AI Endpoints integration lives in the `langchain_nvidia_ai_endpoints` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install --upgrade --quiet langchain-nvidia-ai-endpoints" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "See [LLM](/docs/how_to#llms) for full functionality." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_nvidia_ai_endpoints import NVIDIA" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "llm = NVIDIA().bind(max_tokens=256)\n", + "llm" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Invocation" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "prompt = \"# Function that does quicksort written in Rust without comments:\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(llm.invoke(prompt))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Stream, Batch, and Async\n", + "\n", + "These models natively support streaming, and, as is the case with all LangChain LLMs, they expose a batch method to handle concurrent requests, as well as async methods for invoke, stream, and batch. Below are a few examples." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for chunk in llm.stream(prompt):\n", + "    print(chunk, end=\"\", flush=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "llm.batch([prompt])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "await llm.ainvoke(prompt)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "async for chunk in llm.astream(prompt):\n", + "    print(chunk, end=\"\", flush=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "await llm.abatch([prompt])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "async for chunk in llm.astream_log(prompt):\n", + "    print(chunk)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "response = llm.invoke(\n", + "    \"X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.1) #Train a logistic regression model, predict the labels on the test set and compute the accuracy score\"\n", + ")\n", + "print(response)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Supported models\n", + "\n", + "Querying `available_models` will give you all of the models offered by your API credentials.\n",
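+ "\n", + "The next cell prints the full list. If you only need the model ids, here is a small sketch (it assumes each entry returned by `get_available_models()` exposes an `id` attribute, which may vary across package versions):\n", + "\n", + "```python\n", + "# List just the ids of the models your credentials can reach (assumes an `id` attribute)\n", + "print([model.id for model in NVIDIA.get_available_models()])\n", + "```"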
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "NVIDIA.get_available_models()\n", + "# llm.get_available_models()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Chaining\n", + "\n", + "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import ChatPromptTemplate\n", + "\n", + "prompt = ChatPromptTemplate(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", + ")\n", + "\n", + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all `NVIDIA` features and configurations head to the API reference: https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.llms.NVIDIA.html" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "langchain-nvidia-ai-endpoints-m0-Y4aGr-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}