exa: init pkg (#16553)
efriis authored Jan 25, 2024
1 parent c4e9c9c commit adc0084
Showing 27 changed files with 1,521 additions and 25 deletions.
1 change: 1 addition & 0 deletions .github/workflows/_integration_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@ jobs:
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
run: |
make integration_tests
Expand Down
1 change: 1 addition & 0 deletions .github/workflows/_release.yml
Expand Up @@ -174,6 +174,7 @@ jobs:
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
run: make integration_tests
working-directory: ${{ inputs.working-directory }}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
"id": "4x4kQ0VcodAC"
},
"source": [
"# Metaphor Search"
"# Exa Search"
]
},
{
Expand All @@ -15,13 +15,13 @@
"id": "V1x8wEUhodAH"
},
"source": [
"Metaphor is a search engine fully designed for use by LLMs. Search for documents on the internet using **natural language queries**, then retrieve **cleaned HTML content** from desired documents.\n",
"Exa (formerly Metaphor Search) is a search engine fully designed for use by LLMs. Search for documents on the internet using **natural language queries**, then retrieve **cleaned HTML content** from desired documents.\n",
"\n",
"Unlike keyword-based search (Google), Metaphor's neural search capabilities allow it to semantically understand queries and return relevant documents. For example, we could search `\"fascinating article about cats\"` and compare the search results from [Google](https://www.google.com/search?q=fascinating+article+about+cats) and [Metaphor](https://metaphor.systems/search?q=fascinating%20article%20about%20cats&autopromptString=Here%20is%20a%20fascinating%20article%20about%20cats%3A). Google gives us SEO-optimized listicles based on the keyword \"fascinating\". Metaphor just works.\n",
"Unlike keyword-based search (Google), Exa's neural search capabilities allow it to semantically understand queries and return relevant documents. For example, we could search `\"fascinating article about cats\"` and compare the search results from [Google](https://www.google.com/search?q=fascinating+article+about+cats) and [Exa](https://search.exa.ai/search?q=fascinating%20article%20about%20cats&autopromptString=Here%20is%20a%20fascinating%20article%20about%20cats%3A). Google gives us SEO-optimized listicles based on the keyword \"fascinating\". Exa just works.\n",
"\n",
"This notebook goes over how to use Metaphor Search with LangChain.\n",
"This notebook goes over how to use Exa Search with LangChain.\n",
"\n",
"First, get a Metaphor API key and add it as an environment variable. Get 1000 free searches/month by [signing up here](https://platform.metaphor.systems/)."
"First, get an Exa API key and add it as an environment variable. Get 1000 free searches/month by [signing up here](https://dashboard.exa.ai/)."
]
},
{
Expand All @@ -34,7 +34,88 @@
"source": [
"import os\n",
"\n",
"os.environ[\"METAPHOR_API_KEY\"] = \"...\""
"os.environ[\"EXA_API_KEY\"] = \"...\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then install the integration package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-exa\n",
"\n",
"# and some deps for this notebook\n",
"%pip install --upgrade --quiet langchain langchain-openai"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using ExaSearchRetriever\n",
"\n",
"ExaSearchRetriever is a retriever that uses Exa Search to retrieve relevant documents."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Result(title='Find Us:', url='https://travelila.com/best-time-to-visit-japan/', id='UFLQGtanQffaDErhngnzgA', score=0.1865834891796112, published_date='2021-01-05', author=None, text='If you are planning to spend your next vacation in Japan, then hold your excitement a bit. It would help if you planned which places you will visit in Japan and the country’s best things. It’s entirel', highlights=None, highlight_scores=None), Result(title='When Is The Best Time of Year To Visit Japan?', url='https://boutiquejapan.com/when-is-the-best-time-of-year-to-visit-japan/', id='70b0IMuaQpshjpBpnwsfUg', score=0.17796635627746582, published_date='2022-09-26', author='Andres Zuleta', text='The good news for travelers is that there is no single best time of year to travel to Japan — yet this makes it hard to decide when to visit, as each season has its own special highlights.When plannin', highlights=None, highlight_scores=None), Result(title='Here is the Best Time to Visit Japan - Cooking Sun', url='https://www.cooking-sun.com/best-time-to-visit-japan/', id='2mh-xvoqGPT-ZRvX9GezNQ', score=0.17497511208057404, published_date='2018-12-17', author='Cooking Sun', text='Japan is a diverse and beautiful country that’s brimming with culture. For some travelers, visiting Japan is a dream come true, since it grazes bucket lists across the globe. One of the best parts abo', highlights=None, highlight_scores=None), Result(title='When to Visit Japan? Bests Times and 2023 Travel Tips', url='https://www.jrailpass.com/blog/when-visit-japan-times', id='KqCnY8fF-nc76n1wNpIo1Q', score=0.17359933257102966, published_date='2020-02-18', author='JRailPass', text='When is the best time to visit Japan? This is a question without a simple answer. 
Japan is a year-round destination, with interesting activities, attractions, and festivities throughout the year.Your ', highlights=None, highlight_scores=None), Result(title='Complete Guide To Visiting Japan In February 2023: Weather, What To See & Do | LIVE JAPAN travel guide', url='https://livejapan.com/en/article-a0002948/', id='i3nmekOdM8_VBxPfcJmxng', score=0.17215865850448608, published_date='2019-11-13', author='Lucio Maurizi', text='\\n \\n \\n HOME\\n Complete Guide To Visiting Japan In February 2023: Weather, What To See & Do\\n \\n \\n \\n \\n \\n \\n Date published: 13 November 2019 \\n Last updated: 26 January 2021 \\n \\n \\n So you’re planning your tra', highlights=None, highlight_scores=None)]\n"
]
},
{
"data": {
"text/plain": [
"AIMessage(content='Based on the given context, there is no specific best time mentioned to visit Japan. Each season has its own special highlights, and Japan is a year-round destination with interesting activities, attractions, and festivities throughout the year. Therefore, the best time to visit Japan depends on personal preferences and the specific activities or events one wants to experience.')"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.runnables import RunnableParallel, RunnablePassthrough\n",
"from langchain_exa import ExaSearchRetriever, TextContentsOptions\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"# retrieve 5 documents, with content truncated at 200 characters\n",
"retriever = ExaSearchRetriever(\n",
" k=5, text_contents_options=TextContentsOptions(max_length=200)\n",
")\n",
"\n",
"prompt = PromptTemplate.from_template(\n",
" \"\"\"Answer the following query based on the following context:\n",
"query: {query}\n",
"<context>\n",
"{context}\n",
"</context>\"\"\"\n",
")\n",
"\n",
"llm = ChatOpenAI()\n",
"\n",
"chain = (\n",
" RunnableParallel({\"context\": retriever, \"query\": RunnablePassthrough()})\n",
" | prompt\n",
" | llm\n",
")\n",
"\n",
"chain.invoke(\"When is the best time to visit japan?\")"
]
},
{
Expand All @@ -43,14 +124,14 @@
"id": "ip5_D9MkodAK"
},
"source": [
"## Using the Metaphor SDK as LangChain Agent Tools\n",
"## Using the Exa SDK as LangChain Agent Tools\n",
"\n",
"The [Metaphor SDK](https://docs.metaphor.systems/) creates a client that can use the Metaphor API to perform three functions:\n",
"The [Exa SDK](https://docs.exa.ai/) creates a client that can use the Exa API to perform three functions:\n",
"- `search`: Given a natural language search query, retrieve a list of search results.\n",
"- `find_similar`: Given a URL, retrieve a list of search results corresponding to webpages which are similar to the document at the provided URL.\n",
"- `get_contents`: Given a list of document ids fetched from `search` or `find_similar`, get cleaned HTML content for each document.\n",
"\n",
"We can use the `@tool` decorator and docstrings to create LangChain Tool wrappers that tell an LLM agent how to use Metaphor."
"We can use the `@tool` decorator and docstrings to create LangChain Tool wrappers that tell an LLM agent how to use Exa."
]
},
{
Expand All @@ -61,7 +142,7 @@
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet metaphor-python"
"%pip install --upgrade --quiet langchain-exa"
]
},
{
Expand All @@ -72,32 +153,32 @@
},
"outputs": [],
"source": [
"from exa_py import Exa\n",
"from langchain.agents import tool\n",
"from metaphor_python import Metaphor\n",
"\n",
"metaphor = Metaphor(api_key=os.environ[\"METAPHOR_API_KEY\"])\n",
"exa = Exa(api_key=os.environ[\"EXA_API_KEY\"])\n",
"\n",
"\n",
"@tool\n",
"def search(query: str):\n",
" \"\"\"Search for a webpage based on the query.\"\"\"\n",
" return metaphor.search(f\"{query}\", use_autoprompt=True, num_results=5)\n",
" return exa.search(f\"{query}\", use_autoprompt=True, num_results=5)\n",
"\n",
"\n",
"@tool\n",
"def find_similar(url: str):\n",
" \"\"\"Search for webpages similar to a given URL.\n",
" The url passed in should be a URL returned from `search`.\n",
" \"\"\"\n",
" return metaphor.find_similar(url, num_results=5)\n",
" return exa.find_similar(url, num_results=5)\n",
"\n",
"\n",
"@tool\n",
"def get_contents(ids: list[str]):\n",
" \"\"\"Get the contents of a webpage.\n",
" The ids passed in should be a list of ids returned from `search`.\n",
" \"\"\"\n",
" return metaphor.get_contents(ids)\n",
" return exa.get_contents(ids)\n",
"\n",
"\n",
"tools = [search, get_contents, find_similar]"
Expand All @@ -109,9 +190,9 @@
"id": "sVe2ca9OodAO"
},
"source": [
"### Providing Metaphor Tools to an Agent\n",
"### Providing Exa Tools to an Agent\n",
"\n",
"We can provide the Metaphor tools we just created to a LangChain `OpenAIFunctionsAgent`. When asked to `Summarize for me a fascinating article about cats`, the agent uses the `search` tool to perform a Metaphor search with an appropriate search query, uses the `get_contents` tool to perform Metaphor content retrieval, and then returns a summary of the retrieved content."
"We can provide the Exa tools we just created to a LangChain `OpenAIFunctionsAgent`. When asked to `Summarize for me a fascinating article about cats`, the agent uses the `search` tool to perform an Exa search with an appropriate search query, uses the `get_contents` tool to perform Exa content retrieval, and then returns a summary of the retrieved content."
]
},
{
Expand Down Expand Up @@ -237,9 +318,11 @@
"id": "e3FHjxT-RoIH"
},
"source": [
"## Advanced Metaphor Features\n",
"## Advanced Exa Features\n",
"\n",
"Exa supports powerful filters by domain and date. We can provide a more powerful `search` tool to the agent that lets it decide to apply filters if they are useful for the objective. See all of Exa's search features [here](https://github.com/metaphorsystems/metaphor-python/).\n",
"\n",
"Metaphor supports powerful filters by domain and date. We can provide a more powerful `search` tool to the agent that lets it decide to apply filters if they are useful for the objective. See all of Metaphor's search features [here](https://github.com/metaphorsystems/metaphor-python/)."
"[//]: # \"TODO(erick): switch metaphor github link to exa github link when sdk published\""
]
},
{
Expand All @@ -250,10 +333,10 @@
},
"outputs": [],
"source": [
"from exa_py import Exa\n",
"from langchain.agents import tool\n",
"from metaphor_python import Metaphor\n",
"\n",
"metaphor = Metaphor(api_key=os.environ[\"METAPHOR_API_KEY\"])\n",
"exa = Exa(api_key=os.environ[\"EXA_API_KEY\"])\n",
"\n",
"\n",
"@tool\n",
Expand All @@ -262,7 +345,7 @@
" Set the optional include_domains (list[str]) parameter to restrict the search to a list of domains.\n",
" Set the optional start_published_date (str) parameter to restrict the search to documents published after the date (YYYY-MM-DD).\n",
" \"\"\"\n",
" return metaphor.search(\n",
" return exa.search(\n",
" f\"{query}\",\n",
" use_autoprompt=True,\n",
" num_results=5,\n",
Expand All @@ -276,15 +359,15 @@
" \"\"\"Search for webpages similar to a given URL.\n",
" The url passed in should be a URL returned from `search`.\n",
" \"\"\"\n",
" return metaphor.find_similar(url, num_results=5)\n",
" return exa.find_similar(url, num_results=5)\n",
"\n",
"\n",
"@tool\n",
"def get_contents(ids: list[str]):\n",
" \"\"\"Get the contents of a webpage.\n",
" The ids passed in should be a list of ids returned from `search`.\n",
" \"\"\"\n",
" return metaphor.get_contents(ids)\n",
" return exa.get_contents(ids)\n",
"\n",
"\n",
"tools = [search, get_contents, find_similar]"
Expand Down Expand Up @@ -449,7 +532,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
"version": "3.11.4"
}
},
"nbformat": 4,
Expand Down
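Both agent-tool cells in the notebook read the key with `os.environ[...]`, which raises a bare `KeyError` when the variable is unset or its name is misspelled. A minimal sketch of a friendlier lookup follows, assuming a hypothetical helper named `require_env`; it is not part of `exa_py`, `langchain-exa`, or this commit, and the key value is a placeholder.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper for illustration only; not part of exa_py or langchain-exa.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "set it before creating the Exa client."
        )
    return value


# Placeholder value for demonstration, not a real API key.
os.environ["EXA_API_KEY"] = "placeholder-key"
api_key = require_env("EXA_API_KEY")
```

With a helper like this, the client construction in the notebook could read `Exa(api_key=require_env("EXA_API_KEY"))`, so a misspelled variable name surfaces as an explicit error message rather than a bare `KeyError`.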
4 changes: 4 additions & 0 deletions docs/vercel.json
Expand Up @@ -3715,6 +3715,10 @@
{
"source": "/docs/integrations/providers/google_document_ai",
"destination": "/docs/integrations/platforms/google#google-document-ai"
},
{
"source": "/docs/integrations/tools/metaphor_search",
"destination": "/docs/integrations/tools/exa_search"
}
]
}
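The new entry sends the old Metaphor notebook URL to the Exa one. As a rough way to sanity-check such a mapping locally, the sketch below resolves a path against a list of rules. It assumes the entries live under a top-level `redirects` key and are matched literally; the enclosing key is not visible in this hunk, and real Vercel matching also supports patterns, which this sketch ignores. `resolve` is a hypothetical helper.

```python
import json

# Assumption: entries sit under a top-level "redirects" key (not shown in the hunk).
config = json.loads(
    """
    {
      "redirects": [
        {
          "source": "/docs/integrations/tools/metaphor_search",
          "destination": "/docs/integrations/tools/exa_search"
        }
      ]
    }
    """
)


def resolve(path: str, redirects: list[dict]) -> str:
    """Return the destination of the first rule whose source equals `path`.

    Literal matching only; paths with no matching rule are returned unchanged.
    """
    for rule in redirects:
        if rule["source"] == path:
            return rule["destination"]
    return path


new_path = resolve("/docs/integrations/tools/metaphor_search", config["redirects"])
```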
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

from typing import Dict, List, Optional, Union

from langchain_core._api.deprecation import deprecated
from langchain_core.callbacks import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
Expand All @@ -11,6 +12,11 @@
from langchain_community.utilities.metaphor_search import MetaphorSearchAPIWrapper


@deprecated(
    since="0.0.15",
    removal="0.2.0",
    alternative="langchain_exa.ExaSearchResults",
)
class MetaphorSearchResults(BaseTool):
    """Tool that queries the Metaphor Search API and gets back json."""

Expand Down
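The decorator above marks `MetaphorSearchResults` as deprecated since 0.0.15, scheduled for removal in 0.2.0, with `langchain_exa.ExaSearchResults` as the replacement. To illustrate the general idea, here is a minimal sketch of a class-level deprecation decorator built on the standard `warnings` module. `deprecated_class` and `OldSearchTool` are hypothetical names, and this is not how `langchain_core._api.deprecation.deprecated` is implemented; it only shows the kind of warning a caller might expect on instantiation.

```python
import functools
import warnings


def deprecated_class(since: str, removal: str, alternative: str):
    """Hypothetical stand-in for a deprecation decorator (not LangChain's own)."""

    def decorator(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            # Warn every time the deprecated class is instantiated.
            warnings.warn(
                f"{cls.__name__} was deprecated in {since} and will be removed "
                f"in {removal}. Use {alternative} instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            original_init(self, *args, **kwargs)

        cls.__init__ = wrapped_init
        return cls

    return decorator


@deprecated_class(
    since="0.0.15",
    removal="0.2.0",
    alternative="langchain_exa.ExaSearchResults",
)
class OldSearchTool:
    """Placeholder class standing in for a deprecated tool."""

    def __init__(self, k: int = 5):
        self.k = k


# Capture the warning so it can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tool = OldSearchTool(k=3)
```

The class keeps working after decoration; only a `DeprecationWarning` naming the alternative is added, which gives callers time to migrate before the removal version.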
1 change: 1 addition & 0 deletions libs/partners/exa/.gitignore
@@ -0,0 +1 @@
__pycache__
21 changes: 21 additions & 0 deletions libs/partners/exa/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 LangChain, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
61 changes: 61 additions & 0 deletions libs/partners/exa/Makefile
@@ -0,0 +1,61 @@
.PHONY: all format lint test tests integration_tests docker_tests help extended_tests

# Default target executed when no arguments are given to make.
all: help

# Define a variable for the test file path.
TEST_FILE ?= tests/unit_tests/

integration_tests: TEST_FILE=tests/integration_tests/

test integration_tests:
	poetry run pytest $(TEST_FILE)

tests:
	poetry run pytest $(TEST_FILE)


######################
# LINTING AND FORMATTING
######################

# Define a variable for Python and notebook files.
PYTHON_FILES=.
MYPY_CACHE=.mypy_cache
lint format: PYTHON_FILES=.
lint_diff format_diff: PYTHON_FILES=$(shell git diff --relative=libs/partners/exa --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
lint_package: PYTHON_FILES=langchain_exa
lint_tests: PYTHON_FILES=tests
lint_tests: MYPY_CACHE=.mypy_cache_test

lint lint_diff lint_package lint_tests:
	poetry run ruff .
	poetry run ruff format $(PYTHON_FILES) --diff
	poetry run ruff --select I $(PYTHON_FILES)
	mkdir -p $(MYPY_CACHE); poetry run mypy $(PYTHON_FILES) --cache-dir $(MYPY_CACHE)

format format_diff:
	poetry run ruff format $(PYTHON_FILES)
	poetry run ruff --select I --fix $(PYTHON_FILES)

spell_check:
	poetry run codespell --toml pyproject.toml

spell_fix:
	poetry run codespell --toml pyproject.toml -w

check_imports: $(shell find langchain_exa -name '*.py')
	poetry run python ./scripts/check_imports.py $^

######################
# HELP
######################

help:
	@echo '----'
	@echo 'check_imports - check imports'
	@echo 'format - run code formatters'
	@echo 'lint - run linters'
	@echo 'test - run unit tests'
	@echo 'tests - run unit tests'
	@echo 'test TEST_FILE=<test_file> - run all tests in file'
1 change: 1 addition & 0 deletions libs/partners/exa/README.md
@@ -0,0 +1 @@
# langchain-exa