community: Add Naver chat model & embeddings #25162

Merged
merged 63 commits into from
Oct 24, 2024
Changes from 44 commits
70d2705
partner: Initialize Naver(hyperclova) package
hyper-clova Jun 26, 2024
d8203b5
partners: add Naver ChatModel
hyper-clova Jun 26, 2024
333c19a
partners: implement NaverChat._agenerate() function
hyper-clova Jun 26, 2024
57415ea
partners: implement NaverChat. _stream() function
hyper-clova Jun 26, 2024
122a96e
partners: implement NaverChat._astream() function
hyper-clova Jun 27, 2024
3d12768
partners: refectoring NaverChat
hyper-clova Jun 27, 2024
bde3b04
partners: add unit test on NaverChat
hyper-clova Jun 27, 2024
c801472
partners: add unit test on NaverChat 2
hyper-clova Jun 27, 2024
215c936
partners: add use tuning model & turn on service app on NaverChat
hyper-clova Jun 28, 2024
d7d4328
partners: add naver embedding
hyper-clova Jul 5, 2024
cd53c76
partners: change class and model name of embedding
hyper-clova Jul 8, 2024
94cf31e
partners: change class and model name of chat
hyper-clova Jul 8, 2024
fee2a91
partners: change seed and timeout default value.
hyper-clova Jul 8, 2024
8105045
partners: change class and model name of chat 2
hyper-clova Jul 9, 2024
46de3b6
docs: add readme.md of langchain-naver.
hyper-clova Jul 9, 2024
6b8e324
docs: update chatModel ipynb file
hyper-clova Jul 10, 2024
a515825
docs: update text_embedding ipynb file
hyper-clova Jul 10, 2024
28ca463
docs: update chat ipynb file 2
hyper-clova Jul 15, 2024
c4b7063
docs: update text_embedding ipynb file 2
hyper-clova Jul 15, 2024
587e8b7
docs: update chat ipynb file 3
hyper-clova Jul 15, 2024
9ad8b5a
partners: change class and model name of chat 3
hyper-clova Jul 15, 2024
61ebb25
partners: update all langchain-naver doc
hyper-clova Jul 15, 2024
a0f885a
partners: update all langchain-naver doc 2
hyper-clova Jul 16, 2024
be803a0
package: move partner package to community package
hyper-clova Aug 1, 2024
8d2e654
package: change import path to community package
hyper-clova Aug 1, 2024
4ead8c0
partners: deprecate existing clova embeddings
hyper-clova Aug 5, 2024
f4e9a68
fix: Integration docs lint for Naver embeddings
hyper-clova Aug 14, 2024
30aba4f
fix: make lint & format & test for naver package
hyper-clova Aug 14, 2024
36aa93a
partners: update integration docs
hyper-clova Aug 21, 2024
331d696
fix: remove result event
hyper-clova Aug 28, 2024
046b2f2
fix: add httpx_sse decorator
hyper-clova Aug 28, 2024
66f7772
fix: add usage_metadata
hyper-clova Aug 29, 2024
1c772ef
fix: change alias of api_key
hyper-clova Aug 29, 2024
2251bd5
fix: update doc
hyper-clova Aug 30, 2024
a5ea304
community[patch]: change root_validator to __init__ on naver package.
hyper-clova Sep 5, 2024
6632871
community[patch]: fix working of None type input
hyper-clova Sep 5, 2024
6c71199
Merge branch 'master' into feat/package-naver-v1
efriis Sep 19, 2024
02c457e
fix: change code by "make lint"
hyper-clova Sep 23, 2024
134d18c
fix: to apply Numeric Constraints & to change default_params from sta…
hyper-clova Sep 24, 2024
0fed7f8
fix: fix to pass test code after apply Numeric Constraints
hyper-clova Sep 24, 2024
9766bbb
Merge branch 'master' into feat/package-naver-v1
efriis Sep 27, 2024
eda9438
fix: fix make lint & test failed (import Self / import httpx_sse / ip…
hyper-clova Sep 30, 2024
1595ac2
Merge branch 'master' into feat/package-naver-v1
efriis Oct 4, 2024
89a724c
Merge branch 'master' into feat/package-naver-v1
efriis Oct 9, 2024
401aa22
community: add httpx-sse in pyproject.toml files
hyper-clova Oct 10, 2024
13d964f
fix: to remove app_id field alias on ClovaXEmbeddings
hyper-clova Oct 10, 2024
0a694a6
community: add httpx-sse in test_dependencies.py
hyper-clova Oct 10, 2024
4390b9a
fix: to change allow_population_by_field_name to populate_by_name Cha…
hyper-clova Oct 10, 2024
a00de50
Merge branch 'master' into feat/package-naver-v1
hyper-clova Oct 10, 2024
6bb2143
docs: update integartions docs on naver package
hyper-clova Oct 10, 2024
abc6731
Merge branch 'master' into feat/package-naver-v1
hyper-clova Oct 21, 2024
0ae096e
Merge branch 'master' into feat/package-naver-v1
vbarda Oct 21, 2024
c60c741
fix: update poetry lock
hyper-clova Oct 22, 2024
8071b4e
Merge branch 'master' into feat/package-naver-v1
vbarda Oct 22, 2024
9c35df7
update lock
vbarda Oct 22, 2024
55c4c3a
Merge branch 'master' into feat/package-naver-v1
vbarda Oct 22, 2024
0a3c795
fix: Update docs/docs/integrations/providers/naver.mdx
hyper-clova Oct 23, 2024
b61cb40
fix: to update on chat/naver.ipynb
hyper-clova Oct 23, 2024
4de196a
fix: change import pydantic & deprecated version on clova.py
hyper-clova Oct 23, 2024
fd982d8
test: add integration test check chat model
hyper-clova Oct 23, 2024
e2edb07
fix: model_name field alias to validation_alias
hyper-clova Oct 23, 2024
947ae51
Merge branch 'master' into feat/package-naver-v1
vbarda Oct 23, 2024
b6124d5
fix: change BaseMessage to AIMessage
hyper-clova Oct 24, 2024
408 changes: 408 additions & 0 deletions docs/docs/integrations/chat/naver.ipynb

Large diffs are not rendered by default.

38 changes: 38 additions & 0 deletions docs/docs/integrations/providers/naver.mdx
@@ -0,0 +1,38 @@
# NAVER

All functionality related to `Naver`, including HyperCLOVA X models, particularly those served through `Naver Cloud` [CLOVA Studio](https://clovastudio.ncloud.com/).

> [Naver](https://navercorp.com/) is a global technology company with cutting-edge technologies and a diverse business portfolio including search, commerce, fintech, content, cloud, and AI.

> [Naver Cloud](https://www.navercloudcorp.com/lang/en/) is the cloud computing arm of Naver, a leading cloud service provider offering a comprehensive suite of cloud services to businesses through its [Naver Cloud Platform (NCP)](https://www.ncloud.com/).

Please refer to the [NCP User Guide](https://guide.ncloud-docs.com/docs/clovastudio-overview) for more detailed instructions (also available in Korean).

## Installation and Setup

- Get both a CLOVA Studio API Key and an API Gateway Key by [creating your app](https://guide.ncloud-docs.com/docs/en/clovastudio-playground01#create-test-app), then set them as the environment variables `NCP_CLOVASTUDIO_API_KEY` and `NCP_APIGW_API_KEY`, respectively.
- Install the integration Python package with:

```bash
pip install -U langchain-community
```
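
Before instantiating either model, it can help to confirm that both variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of `langchain-community`.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_ENV_VARS = ("NCP_CLOVASTUDIO_API_KEY", "NCP_APIGW_API_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    # Hypothetical helper for illustration; defaults to the process environment.
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All CLOVA Studio credentials are set.")
```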

## Chat models

### ChatClovaX

See a [usage example](/docs/integrations/chat/naver).

```python
from langchain_community.chat_models import ChatClovaX
```
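
A rough sketch of a single chat call, assuming the two environment variables from the setup step are set and that `ChatClovaX` can be constructed without arguments. The wrapper function `ask_clova_x` and the system prompt are illustrative only; `invoke` is the standard LangChain chat model entry point.

```python
def ask_clova_x(question):
    """Send one question to ChatClovaX and return the reply text."""
    # Imported lazily so this module loads even where langchain-community is absent.
    from langchain_community.chat_models import ChatClovaX

    # Assumption: credentials are picked up from the environment variables above.
    chat = ChatClovaX()
    messages = [
        ("system", "You are a helpful assistant."),
        ("human", question),
    ]
    return chat.invoke(messages).content
```

Calling `ask_clova_x("What is CLOVA Studio?")` performs a network request, so it needs valid credentials.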

## Embedding models

### ClovaXEmbeddings

See a [usage example](/docs/integrations/text_embedding/naver).

```python
from langchain_community.embeddings import ClovaXEmbeddings
```
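
The accompanying notebook notes that embeddings may need to be normalized depending on the use case. A minimal sketch of L2 normalization and cosine similarity in plain Python follows; the helper names are hypothetical, and any list of floats (for example, the result of `embed_query`) can be passed in.

```python
import math


def l2_normalize(vector):
    """Scale `vector` to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of the normalized vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))
```

Normalizing once at indexing time lets a plain dot product stand in for cosine similarity at query time.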
312 changes: 312 additions & 0 deletions docs/docs/integrations/text_embedding/naver.ipynb
@@ -0,0 +1,312 @@
{
"cells": [
{
"cell_type": "raw",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Naver\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"# ClovaXEmbeddings\n",
"\n",
"This notebook covers how to get started with embedding models provided by CLOVA Studio. For detailed documentation on `ClovaXEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/latest/api_reference/community/embeddings/langchain_community.embeddings.naver.ClovaXEmbeddings.html).\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Provider | Package |\n",
"|:--------:|:-------:|\n",
"| [Naver](/docs/integrations/providers/naver.mdx) | [ClovaXEmbeddings](https://python.langchain.com/latest/api_reference/community/embeddings/langchain_community.embeddings.naver.ClovaXEmbeddings.html) |\n",
"\n",
"## Setup\n",
"\n",
"Before using embedding models provided by CLOVA Studio, you must go through the three steps below.\n",
"\n",
"1. Creating [NAVER Cloud Platform](https://www.ncloud.com/) account \n",
"2. Apply to use [CLOVA Studio](https://www.ncloud.com/product/aiService/clovaStudio)\n",
"3. Find API Keys after creating CLOVA Studio Test App or Service App (See [here](https://guide.ncloud-docs.com/docs/en/clovastudio-playground01#테스트앱생성).)\n",
"\n",
"### Credentials\n",
"\n",
"CLOVA Studio requires 3 keys (`NCP_CLOVASTUDIO_API_KEY`, `NCP_APIGW_API_KEY` and `NCP_CLOVASTUDIO_APP_ID`) for embeddings.\n",
"- `NCP_CLOVASTUDIO_API_KEY` and `NCP_CLOVASTUDIO_APP_ID` is issued per serviceApp or testApp\n",
"- `NCP_APIGW_API_KEY` is issued per account\n",
"\n",
"The two API Keys could be found by clicking `App Request Status` > `Service App, Test App List` > `‘Details’ button for each app` in [CLOVA Studio](https://clovastudio.ncloud.com/studio-application/service-app)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c52e8a50-3e67-4272-bc80-3954d98f8dea",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"NCP_CLOVASTUDIO_API_KEY\"] = getpass.getpass(\"NCP CLOVA Studio API Key: \")\n",
"os.environ[\"NCP_APIGW_API_KEY\"] = getpass.getpass(\"NCP API Gateway API Key: \")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "83520d8e-ecf8-4e47-b3bc-1ac205b3a2ab",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"NCP_CLOVASTUDIO_APP_ID\"] = input(\"NCP CLOVA Studio App ID: \")"
]
},
{
"cell_type": "markdown",
"id": "ff00653e",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"ClovaXEmbeddings integration lives in the `langchain_community` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "99400c9b",
"metadata": {},
"outputs": [],
"source": [
"# install package\n",
"!pip install -U langchain-community"
]
},
{
"cell_type": "markdown",
"id": "2651e611-9d5b-4315-9bbd-f99f56be4e19",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our embeddings object and embed query or document:\n",
"\n",
"- There are several embedding models available in CLOVA Studio. Please refer [here](https://guide.ncloud-docs.com/docs/en/clovastudio-explorer03#임베딩API) for further details.\n",
"- Note that you might need to normalize the embeddings depending on your specific use case."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62e0dbc3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain_community.embeddings import ClovaXEmbeddings\n",
"\n",
"embeddings = ClovaXEmbeddings(\n",
" # model=\"clir-emb-dolphin\" #default is `clir-emb-dolphin`. change with the model name of corresponding App ID if needed.\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0493b4a8",
"metadata": {},
"source": [
"## Indexing and Retrieval\n",
"\n",
"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n",
"\n",
"Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4d59653",
"metadata": {},
"outputs": [],
"source": [
"# Create a vector store with a sample text\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"\n",
"text = \"CLOVA Studio is an AI development tool that allows you to customize your own HyperCLOVA X models.\"\n",
"\n",
"vectorstore = InMemoryVectorStore.from_texts(\n",
" [text],\n",
" embedding=embeddings,\n",
")\n",
"\n",
"# Use the vectorstore as a retriever\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"# Retrieve the most similar text\n",
"retrieved_documents = retriever.invoke(\"What is CLOVA Studio?\")\n",
"\n",
"# show the retrieved document's content\n",
"retrieved_documents[0].page_content"
]
},
{
"cell_type": "markdown",
"id": "b1a249e1",
"metadata": {},
"source": [
"## Direct Usage\n",
"\n",
"Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n",
"\n",
"You can directly call these methods to get embeddings for your own use cases.\n",
"\n",
"### Embed single texts\n",
"\n",
"You can embed single texts or documents with `embed_query`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "12fcfb4b",
"metadata": {},
"outputs": [],
"source": [
"embeddings.embed_query(\"My query to look up\")"
]
},
{
"cell_type": "markdown",
"id": "8b383b53",
"metadata": {},
"source": [
"### Embed multiple texts\n",
"\n",
"You can embed multiple texts with `embed_documents`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f2e6104",
"metadata": {},
"outputs": [],
"source": [
"embeddings.embed_documents(\n",
" [\"This is a content of the document\", \"This is another document\"]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "464a4aae",
"metadata": {},
"source": [
"### Embed with async\n",
"\n",
"There are also async functionalities:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "46739f68",
"metadata": {},
"outputs": [],
"source": [
"# async embed query\n",
"await embeddings.aembed_query(\"My query to look up\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e48632ea",
"metadata": {},
"outputs": [],
"source": [
"# async embed documents\n",
"await embeddings.aembed_documents(\n",
" [\"This is a content of the document\", \"This is another document\"]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "eee40d32367cc5c4",
"metadata": {},
"source": [
"## Additional functionalities\n",
"\n",
"### Service App\n",
"\n",
"When going live with production-level application using CLOVA Studio, you should apply for and use Service App. (See [here](https://guide.ncloud-docs.com/docs/en/clovastudio-playground01#서비스앱신청).)\n",
"\n",
"For a Service App, corresponding `NCP_CLOVASTUDIO_API_KEY` and `NCP_CLOVASTUDIO_APP_ID` are issued and can only be called with the API Keys."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "08f9f44e-c6a4-4163-8caf-27a0cda345b7",
"metadata": {},
"outputs": [],
"source": [
"#### Update environment variables\n",
"\n",
"os.environ[\"NCP_CLOVASTUDIO_API_KEY\"] = getpass.getpass(\n",
" \"NCP CLOVA Studio API Key for Service App: \"\n",
")\n",
"os.environ[\"NCP_CLOVASTUDIO_APP_ID\"] = input(\"NCP CLOVA Studio Service App ID: \")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "86f59698-b3f4-4b19-a9d4-4facfcea304b",
"metadata": {},
"outputs": [],
"source": [
"embeddings = ClovaXEmbeddings(service_app=True)"
]
},
{
"cell_type": "markdown",
"id": "1ddeaee9",
"metadata": {},
"source": [
"## API Reference\n",
"\n",
"For detailed documentation on `ClovaXEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/latest/api_reference/community/embeddings/langchain_community.embeddings.naver.ClovaXEmbeddings.html)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
5 changes: 5 additions & 0 deletions libs/community/langchain_community/chat_models/__init__.py
@@ -125,6 +125,9 @@
from langchain_community.chat_models.moonshot import (
MoonshotChat,
)
from langchain_community.chat_models.naver import (
ChatClovaX,
)
from langchain_community.chat_models.oci_generative_ai import (
ChatOCIGenAI, # noqa: F401
)
@@ -188,6 +191,7 @@
"ChatAnthropic",
"ChatAnyscale",
"ChatBaichuan",
"ChatClovaX",
"ChatCohere",
"ChatCoze",
"ChatOctoAI",
@@ -249,6 +253,7 @@
"ChatAnthropic": "langchain_community.chat_models.anthropic",
"ChatAnyscale": "langchain_community.chat_models.anyscale",
"ChatBaichuan": "langchain_community.chat_models.baichuan",
"ChatClovaX": "langchain_community.chat_models.naver",
"ChatCohere": "langchain_community.chat_models.cohere",
"ChatCoze": "langchain_community.chat_models.coze",
"ChatDatabricks": "langchain_community.chat_models.databricks",