Skip to content

Commit

Permalink
Merge branch 'langchain-ai:master' into master
Browse files Browse the repository at this point in the history
  • Loading branch information
fzowl authored Dec 19, 2024
2 parents 5eed01f + c823cc5 commit a9782e1
Show file tree
Hide file tree
Showing 57 changed files with 4,339 additions and 2,134 deletions.
129 changes: 129 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
repos:
- repo: local
hooks:
- id: core
name: format core
language: system
entry: make -C libs/core format
files: ^libs/core/
pass_filenames: false
- id: community
name: format community
language: system
entry: make -C libs/community format
files: ^libs/community/
pass_filenames: false
- id: langchain
name: format langchain
language: system
entry: make -C libs/langchain format
files: ^libs/langchain/
pass_filenames: false
- id: standard-tests
name: format standard-tests
language: system
entry: make -C libs/standard-tests format
files: ^libs/standard-tests/
pass_filenames: false
- id: text-splitters
name: format text-splitters
language: system
entry: make -C libs/text-splitters format
files: ^libs/text-splitters/
pass_filenames: false
- id: anthropic
name: format partners/anthropic
language: system
entry: make -C libs/partners/anthropic format
files: ^libs/partners/anthropic/
pass_filenames: false
- id: chroma
name: format partners/chroma
language: system
entry: make -C libs/partners/chroma format
files: ^libs/partners/chroma/
pass_filenames: false
- id: couchbase
name: format partners/couchbase
language: system
entry: make -C libs/partners/couchbase format
files: ^libs/partners/couchbase/
pass_filenames: false
- id: exa
name: format partners/exa
language: system
entry: make -C libs/partners/exa format
files: ^libs/partners/exa/
pass_filenames: false
- id: fireworks
name: format partners/fireworks
language: system
entry: make -C libs/partners/fireworks format
files: ^libs/partners/fireworks/
pass_filenames: false
- id: groq
name: format partners/groq
language: system
entry: make -C libs/partners/groq format
files: ^libs/partners/groq/
pass_filenames: false
- id: huggingface
name: format partners/huggingface
language: system
entry: make -C libs/partners/huggingface format
files: ^libs/partners/huggingface/
pass_filenames: false
- id: mistralai
name: format partners/mistralai
language: system
entry: make -C libs/partners/mistralai format
files: ^libs/partners/mistralai/
pass_filenames: false
- id: nomic
name: format partners/nomic
language: system
entry: make -C libs/partners/nomic format
files: ^libs/partners/nomic/
pass_filenames: false
- id: ollama
name: format partners/ollama
language: system
entry: make -C libs/partners/ollama format
files: ^libs/partners/ollama/
pass_filenames: false
- id: openai
name: format partners/openai
language: system
entry: make -C libs/partners/openai format
files: ^libs/partners/openai/
pass_filenames: false
- id: pinecone
name: format partners/pinecone
language: system
entry: make -C libs/partners/pinecone format
files: ^libs/partners/pinecone/
pass_filenames: false
- id: prompty
name: format partners/prompty
language: system
entry: make -C libs/partners/prompty format
files: ^libs/partners/prompty/
pass_filenames: false
- id: qdrant
name: format partners/qdrant
language: system
entry: make -C libs/partners/qdrant format
files: ^libs/partners/qdrant/
pass_filenames: false
- id: voyageai
name: format partners/voyageai
language: system
entry: make -C libs/partners/voyageai format
files: ^libs/partners/voyageai/
pass_filenames: false
- id: root
name: format docs, cookbook
language: system
entry: make format
files: ^(docs|cookbook)/
pass_filenames: false
6 changes: 5 additions & 1 deletion docs/docs/contributing/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,11 @@ If you are able to help answer questions, please do so! This will allow the main

Our [issues](https://github.com/langchain-ai/langchain/issues) page is kept up to date with bugs, improvements, and feature requests.

There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help organize issues.
There is a [taxonomy of labels](https://github.com/langchain-ai/langchain/labels?sort=count-desc)
to help with sorting and discovery of issues of interest. Please use these to help
organize issues. Check out the [Help Wanted](https://github.com/langchain-ai/langchain/labels/help%20wanted)
and [Good First Issue](https://github.com/langchain-ai/langchain/labels/good%20first%20issue)
tags for recommendations.

If you start working on an issue, please assign it to yourself.

Expand Down
6 changes: 5 additions & 1 deletion docs/docs/integrations/chat/oci_data_science.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -137,7 +137,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -156,6 +156,10 @@
" \"temperature\": 0.2,\n",
" \"max_tokens\": 512,\n",
" }, # other model params...\n",
" default_headers={\n",
" \"route\": \"/v1/chat/completions\",\n",
" # other request headers ...\n",
" },\n",
")"
]
},
Expand Down
10 changes: 5 additions & 5 deletions docs/docs/integrations/chat/oci_generative_ai.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -26,14 +26,14 @@
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/oci_generative_ai) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatOCIGenAI](https://python.langchain.com/api_reference/community/chat_models/langchain_community.chat_models.oci_generative_ai.ChatOCIGenAI.html) | [langchain-community](https://python.langchain.com/api_reference/community/index.html) | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-oci-generative-ai?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-oci-generative-ai?style=flat-square&label=%20) |\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/oci_generative_ai) |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [ChatOCIGenAI](https://python.langchain.com/api_reference/community/chat_models/langchain_community.chat_models.oci_generative_ai.ChatOCIGenAI.html) | [langchain-community](https://python.langchain.com/api_reference/community/index.html) | ❌ | ❌ | ❌ |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | [JSON mode](/docs/how_to/structured_output/#advanced-specifying-the-method-for-structuring-outputs) | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ✅ | ✅ | | | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | \n",
"| ✅ | ✅ | | | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | \n",
"\n",
"## Setup\n",
"\n",
Expand Down
149 changes: 149 additions & 0 deletions docs/docs/integrations/document_loaders/yt_dlp.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,149 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: YoutubeLoaderDL\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# YoutubeLoaderDL\n",
"\n",
"Loader for Youtube leveraging the `yt-dlp` library.\n",
"\n",
"This package implements a [document loader](/docs/concepts/document_loaders/) for Youtube. In contrast to the [YoutubeLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.youtube.YoutubeLoader.html) of `langchain-community`, which relies on `pytube`, `YoutubeLoaderDL` is able to fetch YouTube metadata. `langchain-yt-dlp` leverages the robust `yt-dlp` library, providing a more reliable and feature-rich YouTube document loader.\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | JS Support |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| YoutubeLoader | langchain-yt-dlp | ✅ | ✅ | ❌ |\n",
"\n",
"## Setup\n",
"\n",
"### Installation\n",
"\n",
"```bash\n",
"pip install langchain-yt-dlp\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialization"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"from langchain_yt_dlp.youtube_loader import YoutubeLoaderDL\n",
"\n",
"# Basic transcript loading\n",
"loader = YoutubeLoaderDL.from_youtube_url(\n",
" \"https://www.youtube.com/watch?v=dQw4w9WgXcQ\", add_video_info=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"documents = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'source': 'dQw4w9WgXcQ',\n",
" 'title': 'Rick Astley - Never Gonna Give You Up (Official Music Video)',\n",
" 'description': 'The official video for “Never Gonna Give You Up” by Rick Astley. \\n\\nNever: The Autobiography 📚 OUT NOW! \\nFollow this link to get your copy and listen to Rick’s ‘Never’ playlist ❤️ #RickAstleyNever\\nhttps://linktr.ee/rickastleynever\\n\\n“Never Gonna Give You Up” was a global smash on its release in July 1987, topping the charts in 25 countries including Rick’s native UK and the US Billboard Hot 100. It also won the Brit Award for Best single in 1988. Stock Aitken and Waterman wrote and produced the track which was the lead-off single and lead track from Rick’s debut LP “Whenever You Need Somebody”. The album was itself a UK number one and would go on to sell over 15 million copies worldwide.\\n\\nThe legendary video was directed by Simon West – who later went on to make Hollywood blockbusters such as Con Air, Lara Croft – Tomb Raider and The Expendables 2. The video passed the 1bn YouTube views milestone on 28 July 2021.\\n\\nSubscribe to the official Rick Astley YouTube channel: https://RickAstley.lnk.to/YTSubID\\n\\nFollow Rick Astley:\\nFacebook: https://RickAstley.lnk.to/FBFollowID \\nTwitter: https://RickAstley.lnk.to/TwitterID \\nInstagram: https://RickAstley.lnk.to/InstagramID \\nWebsite: https://RickAstley.lnk.to/storeID \\nTikTok: https://RickAstley.lnk.to/TikTokID\\n\\nListen to Rick Astley:\\nSpotify: https://RickAstley.lnk.to/SpotifyID \\nApple Music: https://RickAstley.lnk.to/AppleMusicID \\nAmazon Music: https://RickAstley.lnk.to/AmazonMusicID \\nDeezer: https://RickAstley.lnk.to/DeezerID \\n\\nLyrics:\\nWe’re no strangers to love\\nYou know the rules and so do I\\nA full commitment’s what I’m thinking of\\nYou wouldn’t get this from any other guy\\n\\nI just wanna tell you how I’m feeling\\nGotta make you understand\\n\\nNever gonna give you up\\nNever gonna let you down\\nNever gonna run around and desert you\\nNever gonna make you cry\\nNever gonna say goodbye\\nNever gonna tell a lie and hurt you\\n\\nWe’ve known each other for so long\\nYour heart’s been aching but you’re too shy to say it\\nInside we both know what’s been going on\\nWe know the game and we’re gonna play it\\n\\nAnd if you ask me how I’m feeling\\nDon’t tell me you’re too blind to see\\n\\nNever gonna give you up\\nNever gonna let you down\\nNever gonna run around and desert you\\nNever gonna make you cry\\nNever gonna say goodbye\\nNever gonna tell a lie and hurt you\\n\\n#RickAstley #NeverGonnaGiveYouUp #WheneverYouNeedSomebody #OfficialMusicVideo',\n",
" 'view_count': 1603360806,\n",
" 'publish_date': datetime.datetime(2009, 10, 25, 0, 0),\n",
" 'length': 212,\n",
" 'author': 'Rick Astley',\n",
" 'channel_id': 'UCuAXFkgsw1L7xaCfnd5JJOw',\n",
" 'webpage_url': 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'}"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"documents[0].metadata"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lazy Load"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- No lazy loading is implemented"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## API reference:\n",
"\n",
"- [Github](https://github.com/aqib0770/langchain-yt-dlp)\n",
"- [PyPi](https://pypi.org/project/langchain-yt-dlp/)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
31 changes: 31 additions & 0 deletions docs/docs/integrations/providers/oceanbase.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# OceanBase

[OceanBase Database](https://github.com/oceanbase/oceanbase) is a distributed relational database.
It is developed entirely by Ant Group. The OceanBase Database is built on a common server cluster.
Based on the Paxos protocol and its distributed structure, the OceanBase Database provides high availability and linear scalability.

OceanBase currently has the ability to store vectors. Users can easily perform the following operations with SQL:

- Create a table containing vector type fields;
- Create a vector index table based on the HNSW algorithm;
- Perform vector approximate nearest neighbor queries;
- ...

## Installation

```bash
pip install -U langchain-oceanbase
```

We recommend using Docker to deploy OceanBase:

```shell
docker run --name=ob433 -e MODE=slim -p 2881:2881 -d oceanbase/oceanbase-ce:4.3.3.0-100000132024100711
```

[More methods to deploy OceanBase cluster](https://github.com/oceanbase/oceanbase-doc/blob/V4.3.1/en-US/400.deploy/500.deploy-oceanbase-database-community-edition/100.deployment-overview.md)

### Usage

For a more detailed walkthrough of the OceanBase Wrapper, see [this notebook](https://github.com/oceanbase/langchain-oceanbase/blob/main/docs/vectorstores.ipynb)

2 changes: 1 addition & 1 deletion docs/docs/integrations/providers/premai.md
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,7 @@ Dense retrieval models typically include:
"repository_id": 1985,
"document_id": 1306,
"chunk_id": 173899,
"document_name": "[D] Difference between sparse and dense informati\u2026",
"document_name": "[D] Difference between sparse and dense information\u2026",
"similarity_score": 0.3209080100059509,
"content": "with the difference or anywhere\nwhere I can read about it?\n\n\n 17 9\n\n\n u/ScotiabankCanada \u2022 Promoted\n\n\n Accelerate your study permit process\n with Scotiabank's Student GIC\n Program. We're here to help you tur\u2026\n\n\n startright.scotiabank.com Learn More\n\n\n Add a Comment\n\n\nSort by: Best\n\n\n DinosParkour \u2022 1y ago\n\n\n Dense Retrieval (DR) m"
}
Expand Down
Loading

0 comments on commit a9782e1

Please sign in to comment.