Skypoint v0.0.351 #5

arunraja1 · 2024-01-02T09:08:18Z

Downmerge version 0.0.351 of langchain to skypoint-langchain

All these changes are from version 0.0.351 release of langchain

…14125) - **Description:** to support not only publicly available Hugging Face endpoints, but also protected ones (created with "Inference Endpoints" Hugging Face feature), I have added ability to specify custom api_url. But if not specified, default behaviour won't change - **Issue:** langchain-ai#9181, - **Dependencies:** no extra dependencies

Add option to override input_type for cohere's v3 embeddings models --------- Co-authored-by: Bagatur <[email protected]>

### Description

Starting from [openai version 1.0.0](https://github.com/openai/openai-python/tree/17ac6779958b2b74999c634c4ea4c7b74906027a#module-level-client), the camel case form of `openai.ChatCompletion` is no longer supported and has been changed to lowercase `openai.chat.completions`. In addition, the returned object only accepts attribute access instead of index access:

```python
import openai

# optional; defaults to `os.environ['OPENAI_API_KEY']`
openai.api_key = '...'

# all client options can be configured just like the `OpenAI` instantiation counterpart
openai.base_url = "https://..."
openai.default_headers = {"x-foo": "true"}

completion = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "How do I output all files in a directory using Python?",
        },
    ],
)
print(completion.choices[0].message.content)
```

So I implemented a compatible adapter that supports both attribute access and index access:

```python
In [1]: from langchain.adapters import openai as lc_openai
   ...: messages = [{"role": "user", "content": "hi"}]

In [2]: result = lc_openai.chat.completions.create(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0
   ...: )

In [3]: result.choices[0].message
Out[3]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [4]: result["choices"][0]["message"]
Out[4]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [5]: result = await lc_openai.chat.completions.acreate(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0
   ...: )

In [6]: result.choices[0].message
Out[6]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [7]: result["choices"][0]["message"]
Out[7]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [8]: for rs in lc_openai.chat.completions.create(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
   ...: ):
   ...:     print(rs.choices[0].delta)
   ...:     print(rs["choices"][0]["delta"])
   ...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}

In [20]: async for rs in await lc_openai.chat.completions.acreate(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
   ...: ):
   ...:     print(rs.choices[0].delta)
   ...:     print(rs["choices"][0]["delta"])
   ...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}
...
```

### Twitter handle

[lin_bob57617](https://twitter.com/lin_bob57617)
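For readers wondering how one object can support both `result.choices[0].message` and `result["choices"][0]["message"]`, here is a minimal sketch of the general idea. It is not the adapter's actual implementation: `IndexableDict` and `wrap` are hypothetical names, and the approach (a `dict` subclass that falls back to key lookup on attribute access) is only one possible way to get this behaviour.

```python
from typing import Any


class IndexableDict(dict):
    """Hypothetical helper: a dict whose keys are also readable as attributes."""

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails, so dict methods
        # such as .items() or .keys() keep working as usual.
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc


def wrap(value: Any) -> Any:
    """Recursively convert nested dicts (including those inside lists)."""
    if isinstance(value, dict):
        return IndexableDict({key: wrap(item) for key, item in value.items()})
    if isinstance(value, list):
        return [wrap(item) for item in value]
    return value


# Stand-in payload shaped like the chat completion output shown above.
result = wrap(
    {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
)
print(result.choices[0].message)
print(result["choices"][0]["message"])
```

One limitation of this sketch: a key that collides with a built-in `dict` attribute (for example `items`) is only reachable through index access.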

@baskaryan

- **Description:** Our PR is an integration of a Steam API Tool that makes recommendations on steam games based on user's Steam profile and provides information on games based on user provided queries.
- **Issue:** the issue # our PR implements: langchain-ai#12120
- **Dependencies:** python-steam-api library, steamspypi library and decouple library
- **Tag maintainer:** @baskaryan, @hwchase17
- **Twitter handle:** N/A

Hello langchain Maintainers,

We are a team of 4 University of Toronto students contributing to langchain as part of our course [CSCD01 (link to course page)](https://cscd01.com/work/open-source-project). We hope our changes help the community. We have run make format, make lint and make test locally before submitting the PR. To our knowledge, our changes do not introduce any new errors.

Our PR integrates the python-steam-api, steamspypi and decouple packages. We have added integration tests to test our python API integration into langchain and an example notebook is also provided.

Our amazing team that contributed to this PR: @JohnY2002, @shenceyang, @andrewqian2001 and @muntaqamahmood

Thank you in advance to all the maintainers for reviewing our PR!

---------

Co-authored-by: Shence <[email protected]>
Co-authored-by: JohnY2002 <[email protected]>
Co-authored-by: Andrew Qian <[email protected]>
Co-authored-by: Harrison Chase <[email protected]>
Co-authored-by: JohnY <[email protected]>

…ain-ai#13966) ### **Description** Hi, I just started learning the source code of `langchain` and hope to contribute code. However, following the instructions in the [CONTRIBUTING.md](https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md) document, I could not get the test command `make test` to run normally. I found that many modules no longer existed after [splitting `langchain_core`](langchain-ai#13823), so I updated the document. ### **Twitter handle** lin_bob57617

**Description:** Adds the document loader for [Couchbase](http://couchbase.com/), a distributed NoSQL database. **Dependencies:** Added the Couchbase SDK as an optional dependency. **Twitter handle:** nithishr --------- Co-authored-by: Bagatur <[email protected]>

Switches to a more maintained solution for building ipynb -> md files (`quarto`) Also bumps us down to python3.8 because it's significantly faster in the vercel build step. Uses default openssl version instead of upgrading as well.

@hwchase17

- **Description:** Obsidian templates can include [variables](https://help.obsidian.md/Plugins/Templates#Template+variables) using double curly braces. `ObsidianLoader` uses PyYaml to parse the frontmatter of documents. This parsing throws an error when encountering variables' curly braces. This is avoided by temporarily substituting safe strings before parsing. - **Issue:** langchain-ai#13887 - **Tag maintainer:** @hwchase17
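The substitution idea described above can be sketched roughly as follows. This is an illustration only, not the `ObsidianLoader` code: the function names, the placeholder format, and the regular expressions are assumptions, and the YAML parsing step itself (PyYaml in the source) is left out so the snippet stays dependency-free.

```python
import re

# Matches Obsidian-style template variables such as {{title}} or {{date:YYYY-MM-DD}}.
TEMPLATE_VAR = re.compile(r"\{\{.*?\}\}")
# Matches the hypothetical placeholders produced by protect_template_vars.
PLACEHOLDER = re.compile(r"__template_var_(\d+)__")


def protect_template_vars(text):
    """Swap each {{...}} for a brace-free placeholder and remember the original."""
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return f"__template_var_{len(saved) - 1}__"

    return TEMPLATE_VAR.sub(stash, text), saved


def restore_template_vars(text, saved):
    """Put the original {{...}} variables back after parsing."""
    return PLACEHOLDER.sub(lambda match: saved[int(match.group(1))], text)


front_matter = "title: {{title}}\ndate: {{date:YYYY-MM-DD}}"
safe_text, saved_vars = protect_template_vars(front_matter)
# A YAML parser would run on safe_text here; the parsed string values
# would then be passed through restore_template_vars.
restored = restore_template_vars(safe_text, saved_vars)
```

Keeping the originals in a list, rather than encoding them into the placeholder, avoids having to escape whatever characters a variable may contain.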

@baskaryan

…n-ai#13695)  --------- Co-authored-by: Nicholas Ceccarelli <[email protected]> Co-authored-by: Harrison Chase <[email protected]>

Co-authored-by: Jacob Matias <[email protected]> Co-authored-by: Karam Daid <[email protected]> Co-authored-by: Jumana <[email protected]> Co-authored-by: KaramDaid <[email protected]> Co-authored-by: Anna Chester <[email protected]> Co-authored-by: Jumana <[email protected]>

@rlancemartin

langchain-ai#14201) If we are not going to make the existing Docstore class also implement `BaseStore[str, Document]`, IMO all base store implementations should always be `[str, bytes]` so that they are more interchangeable. CC @rlancemartin @eyurtsev

…le (langchain-ai#14230) **Description:** When a RunnableLambda only receives a synchronous callback, this callback is wrapped into an async one since langchain-ai#13408. However, this wrapping with `(*args, **kwargs)` causes the `accepts_config` check at [/libs/core/langchain_core/runnables/config.py#L342](https://github.com/langchain-ai/langchain/blob/ee94ef55ee6ab064da08340817955f821dfa6261/libs/core/langchain_core/runnables/config.py#L342) to fail, as this checks for the presence of a "config" argument in the method signature. Wrapping it with `functools.wraps` resolves this.
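The effect described above can be reproduced in isolation. The sketch below is not the langchain code: `accepts_config` here is a simplified stand-in that only inspects the signature, and the two wrapper helpers are hypothetical. It relies on the fact that `inspect.signature` follows the `__wrapped__` attribute that `functools.wraps` sets.

```python
import asyncio
import functools
import inspect


def accepts_config(func):
    """Simplified stand-in: does the callable declare a `config` parameter?"""
    try:
        return "config" in inspect.signature(func).parameters
    except ValueError:
        return False


def to_async_naive(func):
    # The wrapper's own signature is (*args, **kwargs), so `config` is hidden.
    async def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


def to_async_wrapped(func):
    # functools.wraps sets __wrapped__, which inspect.signature follows,
    # so the original parameter list stays visible.
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


def add_one(x, config):
    return x + 1


naive_ok = accepts_config(to_async_naive(add_one))
wrapped_ok = accepts_config(to_async_wrapped(add_one))
value = asyncio.run(to_async_wrapped(add_one)(1, config={}))
```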

@rlancemartin

…hain-ai#14202) Allow users to pass a generic `BaseStore[str, bytes]` to MultiVectorRetriever, removing the need to use the `create_kv_docstore` method. This encoding will now happen internally. @rlancemartin @eyurtsev --------- Co-authored-by: Eugene Yurtsev <[email protected]>

The `/docs/integrations/toolkits/vectorstore` page is not the Integration page. The best place is in `/docs/modules/agents/how_to/` - Moved the file - Rerouted the page URL

# Dependencies None # Twitter handle @HKydlicek --------- Co-authored-by: Erick Friis <[email protected]>

@baskaryan

…ts (langchain-ai#9027) The Github utilities are fantastic, so I'm adding support for deeper interaction with pull requests. Agents should read "regular" comments and review comments, and the content of PR files (with summarization or `ctags` abbreviations). Progress: - [x] Add functions to read pull requests and the full content of modified files. - [x] Function to use Github's built in code / issues search. Out of scope: - Smarter summarization of file contents of large pull requests (`tree` output, or ctags). - Smarter functions to checkout PRs and edit the files incrementally before bulk committing all changes. - Docs example for creating two agents: - One watches issues: For every new issue, open a PR with your best attempt at fixing it. - The other watches PRs: For every new PR && every new comment on a PR, check the status and try to finish the job.  --------- Co-authored-by: Erick Friis <[email protected]>

Running a large number of requests to Embaas' servers (or any server) can result in intermittent network failures (both from local and external network/service issues). This PR implements exponential backoff retries to help mitigate this issue.
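As a rough illustration of exponential backoff retries, consider the sketch below. It is not the code from this PR: the function name, the default delays, the jitter, and the set of retried exceptions are all assumptions chosen for the example.

```python
import random
import time


def retry_with_backoff(
    func,
    max_retries=3,
    base_delay=0.5,
    max_delay=8.0,
    retry_on=(ConnectionError, TimeoutError),
    sleep=time.sleep,
):
    """Call func(), retrying on transient errors with exponentially growing waits."""
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= max_retries:
                raise  # out of retries: surface the last error
            # 0.5s, 1s, 2s, ... capped at max_delay, plus up to 10% jitter.
            delay = min(max_delay, base_delay * 2**attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
            attempt += 1


# Usage with a stand-in request that fails twice before succeeding.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network failure")
    return "ok"


recorded_delays = []
outcome = retry_with_backoff(flaky_request, sleep=recorded_delays.append)
```

Passing `sleep` as a parameter keeps the example testable without real waiting; in normal use the default `time.sleep` applies.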

@baskaryan

- **Description:** fixed the transform_input method in the example. - **Issue:** example didn't work - **Dependencies:** None - **Tag maintainer:** @baskaryan - **Twitter handle:** @Ravidhu87

Co-authored-by: SebastjanPrachovskij <[email protected]>

Hi! I'm Alex, Python SDK Team Lead from [Comet](https://www.comet.com/site/). This PR contains our new integration between langchain and Comet - `CometTracer` class which uses new `comet_llm` python package for submitting data to Comet. No additional dependencies for the langchain package are required directly, but if the user wants to use `CometTracer`, `comet-llm>=2.0.0` should be installed. Otherwise an exception will be raised from `CometTracer.__init__`. A test for the feature is included. There is also an already existing callback (and .ipynb file with example) which ideally should be deprecated in favor of a new tracer. I wasn't sure how exactly you'd prefer to do it. For example we could open a separate PR for that. I'm open to your ideas :)

@baskaryan

- **Description:** Bugfix duckduckgo_search news search - **Issue:** langchain-ai#13648 - **Dependencies:** None - **Tag maintainer:** @baskaryan --------- Co-authored-by: Harrison Chase <[email protected]>

@baskaryan

…chain-ai#13619) **Description** Implements `max_marginal_relevance_search` and `max_marginal_relevance_search_by_vector` for the Momento Vector Index vectorstore. Additionally bumps the `momento` dependency in the lock file and adds logging to the implementation. **Dependencies** ✅ updates `momento` dependency in lock file **Tag maintainer** @baskaryan **Twitter handle** Please tag @momentohq for Momento Vector Index and @mloml for the contribution 🙇

delete code that could never be reached

- npm - search config - custom

Add [Text Embeddings by Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai/models/text-embeddings/). It's a new integration. Trying to align it with its langchain-js version counterpart [here](https://api.js.langchain.com/classes/embeddings_cloudflare_workersai.CloudflareWorkersAIEmbeddings.html). - Dependencies: N/A - Ran `make format`, `make lint`, `make spell_check` and `make integration_tests`, and all my changes passed

Co-authored-by: stvhu-bookend <[email protected]>

@baskaryan

The current check validates the first element, a `shapely.geometry.point.Point`, against `gpd.GeoSeries`: `if not isinstance(data_frame[page_content_column].iloc[0], gpd.GeoSeries): raise ValueError(f"Expected data_frame[{page_content_column}] to be a GeoSeries")`. It needs to validate the GeoSeries itself and not the `shapely.geometry.point.Point`: `if not isinstance(data_frame[page_content_column], gpd.GeoSeries): raise ValueError(f"Expected data_frame[{page_content_column}] to be a GeoSeries")`.

@hwchase17

Description: There's a copy-paste typo where on_llm_error() calls _on_chain_error() instead of _on_llm_error(). Issue: langchain-ai#13580 Dependencies: None Tag maintainer: @hwchase17 Twitter handle: @jwatte "Run `make format`, `make lint` and `make test` to check this locally." The test scripts don't work in a plain Ubuntu LTS 20.04 system. It looks like the dev container pulling is stuck. Or maybe the internet is just ornery today. --------- Co-authored-by: jwatte <[email protected]> Co-authored-by: Harrison Chase <[email protected]>

…-ai#14823) ... variable, accompanied by a quote Co-authored-by: Yacine Bouakkaz <[email protected]>

[ScaNN](https://python.langchain.com/docs/integrations/providers/scann) and [DynamoDB](https://python.langchain.com/docs/integrations/platforms/aws#aws-dynamodb) pages in `providers` are redundant because we have those references in the Google and AWS platform pages. It is confusing. - I removed the unnecessary pages and redirected the files to their new names.

Corrected path reference from package/pirate-speak to packages/pirate-speak

Now that it's supported again for OAI chat models. Shame this wouldn't include it in the `.invoke()` output though (it's not included in the message itself). Would need to do a follow-up for that to be the case.

…ndpoints` (langchain-ai#14827) This page doesn't exist: - https://python.langchain.com/docs/integrations/text_embeddings/nvidia_ai_endpoints but this one does: - https://python.langchain.com/docs/integrations/text_embedding/nvidia_ai_endpoints

Description: Fixes minor typo to documentation

**Description:** fixing a broken link to the extraction doc page

@vladkol

…xAIEmbeddings (langchain-ai#13999) - **Description:** VertexAIEmbeddings performance improvements - **Twitter handle:** @vladkol ## Improvements - Dynamic batch size, starting from 250, lowering down to 5. Batch size varies across regions. Some regions support larger batches, and it significantly improves performance. When running large batches of texts in `us-central1`, performance gain can be up to 3.5x. The dynamic batching also makes sure every batch is below 20K token limit. - New model parameter `embeddings_type` that translates to `task_type` parameter of the API. Newer model versions support [different embeddings task types](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings#api_changes_to_models_released_on_or_after_august_2023).
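The dynamic batch size idea can be sketched as below. This is not the `VertexAIEmbeddings` implementation: the source only states that the batch size starts at 250 and can drop to 5, so the halving strategy, the `ValueError` signal, and the `embed_batch` callable are assumptions made for illustration. The 20K token check is not modelled here.

```python
def embed_with_dynamic_batches(texts, embed_batch, start_size=250, min_size=5):
    """Embed texts in batches, shrinking the batch size when a batch is rejected.

    `embed_batch` is a hypothetical callable that returns one vector per text
    and raises ValueError when the batch is too large for the backend.
    """
    size = start_size
    vectors = []
    position = 0
    while position < len(texts):
        batch = texts[position : position + size]
        try:
            vectors.extend(embed_batch(batch))
        except ValueError:
            if size <= min_size:
                raise  # already at the floor: give up
            size = max(min_size, size // 2)  # retry the same texts with a smaller batch
            continue
        position += len(batch)
    return vectors, size


# Stand-in backend that accepts at most 100 texts per call.
def fake_embed_batch(batch):
    if len(batch) > 100:
        raise ValueError("batch too large")
    return [[float(len(text))] for text in batch]


all_vectors, final_size = embed_with_dynamic_batches(["text"] * 300, fake_embed_batch)
```

Once a working size is found it is reused for the remaining texts, so the cost of probing is paid only at the start.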

Gpt-3.5 sometimes calls with empty string arguments instead of `{}` I'd assume it's because the typescript representation on their backend makes it a bit ambiguous.
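A minimal sketch of the kind of guard this implies, assuming the arguments arrive as a JSON string; `parse_tool_arguments` is a hypothetical name, not the parser changed in this commit.

```python
import json


def parse_tool_arguments(raw_arguments):
    """Treat an empty or whitespace-only argument string as an empty JSON object."""
    if raw_arguments is None or not raw_arguments.strip():
        return {}
    return json.loads(raw_arguments)


empty_case = parse_tool_arguments("")
normal_case = parse_tool_arguments('{"query": "weather"}')
```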

…#14830) Tool outputs have to be strings apparently. Ensure they are formatted correctly before passing as intermediate steps. ``` BadRequestError: Error code: 400 - {'error': {'message': '1 validation error for Request\nbody -> tool_outputs -> 0 -> output\n str type expected (type=type_error.str)', 'type': 'invalid_request_error', 'param': None, 'code': None}} ```
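One way to guarantee string outputs is sketched below. The helper name and the choice of JSON as the serialization are assumptions for illustration; the commit itself may format outputs differently.

```python
import json


def format_tool_output(output):
    """Coerce a tool result to a string before submitting it as a tool output."""
    if isinstance(output, str):
        return output
    try:
        return json.dumps(output)
    except TypeError:
        # Not JSON-serializable (for example a custom object): fall back to str().
        return str(output)


as_text = format_tool_output("already a string")
as_json = format_tool_output({"temperature": 21})
as_number = format_tool_output(42)
```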

<img width="1305" alt="Screenshot 2023-12-18 at 9 54 01 PM" src="https://github.com/langchain-ai/langchain/assets/10000925/c943fd81-cd48-46eb-8dff-4680424d9ba9"> The current model is no longer available.

- **Description:** Modification of descriptions for marketing purposes and transitioning towards `platforms` directory if possible. - **Issue:** Some marketing opportunities, lodging PR and awaiting later discussions. - This PR is intended to be merged when decisions settle/hopefully after further considerations. Submitting as Draft for now. Nobody @'d yet. --------- Co-authored-by: Bagatur <[email protected]>

dmitryrPlanner5D and others added 30 commits December 4, 2023 12:08

Add input_type override (langchain-ai#14068)

0f02081

Add option to override input_type for cohere's v3 embeddings models --------- Co-authored-by: Bagatur <[email protected]>

nbdoc -> quarto (langchain-ai#14156)

f6d68d7

Switches to a more maintained solution for building ipynb -> md files (`quarto`) Also bumps us down to python3.8 because it's significantly faster in the vercel build step. Uses default openssl version instead of upgrading as well.

docs[patch]: moved vectorstore notebook file (langchain-ai#14181)

1750cc4

The `/docs/integrations/toolkits/vectorstore` page is not the Integration page. The best place is in `/docs/modules/agents/how_to/` - Moved the file - Rerouted the page URL

core[patch]: add response kwarg to on_llm_error

aa8ae31

# Dependencies None # Twitter handle @HKydlicek --------- Co-authored-by: Erick Friis <[email protected]>

docs[patch]: fix columns (langchain-ai#14251)

f26d88c

Fix Sagemaker Endpoint documentation (langchain-ai#13660)

224aa51

- **Description:** fixed the transform_input method in the example. - **Issue:** example didn't work - **Dependencies:** None - **Tag maintainer:** @baskaryan - **Twitter handle:** @Ravidhu87

Harrison/searchapi (langchain-ai#14252)

921c4b5

Co-authored-by: SebastjanPrachovskij <[email protected]>

Bugfix duckduckgo_search news search (langchain-ai#13670)

ee9abb6

- **Description:** Bugfix duckduckgo_search news search - **Issue:** langchain-ai#13648 - **Dependencies:** None - **Tag maintainer:** @baskaryan --------- Co-authored-by: Harrison Chase <[email protected]>

fake consistent embeddings cleanup (langchain-ai#14256)

4fb72ff

delete code that could never be reached

docs[patch]: search experiment (langchain-ai#14254)

4351b99

- npm - search config - custom

fix comet tracer (langchain-ai#14259)

c51001f

Harrison/bookend ai (langchain-ai#14258)

2213fc9

Co-authored-by: stvhu-bookend <[email protected]>

yacine555 and others added 22 commits December 17, 2023 16:41

docs: ensure consistency in declaring LANGCHAIN_API_KEY... (langchain…

2929509

…-ai#14823) ... variable, accompanied by a quote Co-authored-by: Yacine Bouakkaz <[email protected]>

docs: Typo in Templates README.md (langchain-ai#14812)

c316731

Corrected path reference from package/pirate-speak to packages/pirate-speak

community: Add logprobs in gen output (langchain-ai#14826)

2d91d2b

Now that it's supported again for OAI chat models. Shame this wouldn't include it in the `.invoke()` output though (it's not included in the message itself). Would need to do a follow-up for that to be the case.

docs: typo in rag use case (langchain-ai#14800)

462321f

Description: Fixes minor typo to documentation

docs: Fix the broken link to Extraction page (langchain-ai#14806)

2e6a9e6

**Description:** fixing a broken link to the extraction doc page

Update parser (langchain-ai#14831)

bbc98a2

Gpt-3.5 sometimes calls with empty string arguments instead of `{}` I'd assume it's because the typescript representation on their backend makes it a bit ambiguous.

community[patch]: Update Tongyi default model_name (langchain-ai#14844)

5de1dc7

<img width="1305" alt="Screenshot 2023-12-18 at 9 54 01 PM" src="https://github.com/langchain-ai/langchain/assets/10000925/c943fd81-cd48-46eb-8dff-4680424d9ba9"> The current model is no longer available.

docs[patch]: gemini keywords (langchain-ai#14856)

9f851d8

docs[patch]: more keywords (langchain-ai#14858)

92957e6

community[patch]: Release 0.0.4 (langchain-ai#14864)

61ad0e8

langchain[patch]: Release 0.0.351 (langchain-ai#14867)

714bef0

add methods to deserialize prompts that were old (langchain-ai#14857)

193f107

Update agent_toolkits imports and repository URL

0a33fd7

Update tools and version in langchain library

50658d8

Version dump

ee0d373

Merge tag 'v0.0.351' into skypoint-v0.0.351

91a47f5

test

927d525

arunraja1 assigned Sandy247 and abhilashsharma1992 Jan 2, 2024

arunraja1 requested review from Sandy247 and abhilashsharma1992 January 2, 2024 12:10

arunraja1 assigned arunraja1 and unassigned Sandy247 and abhilashsharma1992 Jan 2, 2024

arunraja1 closed this Jan 2, 2024
