Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community: chroma error patch(attribute changed on chroma) #27827

Merged
merged 35 commits into from
Nov 5, 2024

Conversation

shjunn
Copy link
Contributor

@shjunn shjunn commented Nov 1, 2024

There was a change of attribute name which was "max_batch_size". It's now "get_max_batch_size" method.
I want to use "create_batches" which is right down below.

Please check this PR link.
reference: chroma-core/chroma#2305

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Nov 1, 2024
Copy link

vercel bot commented Nov 1, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Nov 5, 2024 7:41pm

@dosubot dosubot bot added community Related to langchain-community Ɑ: vector store Related to vector store module 🤖:nit Small modifications/deletions, fixes, deps or improvements to existing code or docs labels Nov 1, 2024
Copy link
Member

@efriis efriis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unfortunately a breaking change caused by a breaking change in their client

should this just be an OR case to make it work for both?

@efriis efriis self-assigned this Nov 4, 2024
@shjunn
Copy link
Contributor Author

shjunn commented Nov 4, 2024

OR case can be a safe way but their client changed the attribute to method because of chroma-core/chroma#2403 (Preventing a max_batch_size not to be set.)
Once they did it, I believe they will not revert the name.
If this code could be another bug, I respect your suggestion and I'll edit this code.
Please let me know

@efriis
Copy link
Member

efriis commented Nov 4, 2024

old versions of chroma that have this function are still available, and we don't want to break those folks' code

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Nov 4, 2024
@shjunn
Copy link
Contributor Author

shjunn commented Nov 4, 2024

Thanks for letting me know.

prithvikannan and others added 15 commits November 5, 2024 08:53
Thank you for contributing to LangChain!

Update references in Databricks integration page to reference our new
partner package databricks-langchain
https://github.com/databricks/databricks-ai-bridge/tree/main/integrations/langchain

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: Prithvi Kannan <[email protected]>
…ontain commas in it (langchain-ai#26365)

- **Description:**
Currently CommaSeparatedListOutputParser can't handle strings that may
contain commas within a column. It would parse any commas as the
delimiter.
Ex. 
"foo, foo2", "bar", "baz"

It will create 4 columns: "foo", "foo2", "bar", "baz"

This should be 3 columns:

"foo, foo2", "bar", "baz"

- **Dependencies:**
Added 2 additional imports, but they are built in python packages.

import csv
from io import StringIO

- **Twitter handle:** @jkyamog

- [ ] **Add tests and docs**: 
1. added simple unit test test_multiple_items_with_comma

---------

Co-authored-by: Erick Friis <[email protected]>
Co-authored-by: Bagatur <[email protected]>
Co-authored-by: Bagatur <[email protected]>
…gchain-ai#27872)

**Description:** 
This PR addresses an issue in the CSVLoader example where data is not
defined, causing a NameError. The line `data = loader.load()` is added
to correctly assign the output of loader.load() to the data variable.
…n-ai#26208)

I added one more 'elif' to read tool call message from `tool_calls`

---------

Co-authored-by: Chester Curme <[email protected]>
…` to fix validation error (langchain-ai#26308)

### Description:
This PR sets a default value of `output_token_limit = 4000` for the
`PowerBIToolkit` to fix the unintentionally validation error.

### Problem:
When attempting to run a code snippet from [Langchain's PowerBI toolkit
documentation](https://python.langchain.com/v0.1/docs/integrations/toolkits/powerbi/)
to interact with a `PowerBIDataset`, the following error occurs:

```
pydantic.v1.error_wrappers.ValidationError: 1 validation error for QueryPowerBITool
output_token_limit
  none is not an allowed value (type=type_error.none.not_allowed)
```

### Root Cause:
The issue arises because when creating a `QueryPowerBITool`, the
`output_token_limit` parameter is unintentionally set to `None`, which
is the current default for `PowerBIToolkit`. However, `QueryPowerBITool`
expects a default value of `4000` for `output_token_limit`. This
unintended override causes the error.


https://github.com/langchain-ai/langchain/blob/17659ca2cdc418436edf2f8c7f50e800dbbf31ca/libs/community/langchain_community/agent_toolkits/powerbi/toolkit.py#L63

https://github.com/langchain-ai/langchain/blob/17659ca2cdc418436edf2f8c7f50e800dbbf31ca/libs/community/langchain_community/agent_toolkits/powerbi/toolkit.py#L72-L79

https://github.com/langchain-ai/langchain/blob/17659ca2cdc418436edf2f8c7f50e800dbbf31ca/libs/community/langchain_community/tools/powerbi/tool.py#L39

### Solution:
To resolve this, the default value of `output_token_limit` is now
explicitly set to `4000` in `PowerBIToolkit` to prevent the accidental
assignment of `None`.

Co-authored-by: ccurme <[email protected]>
Documentation: Add NVIDIA as integration provider

cc: @mattf @dglogo

Co-authored-by: Erick Friis <[email protected]>
sifatj and others added 14 commits November 5, 2024 08:53
…i#27838)

**Description**: Update outdated `VectorStore` api reference urls in
`retrievers.ipynb`

Co-authored-by: Erick Friis <[email protected]>
…#27841)

**Description**: Update VectorStore api reference url in `rag.ipynb`

Co-authored-by: Erick Friis <[email protected]>
Description:

This fixes an issue that mistakenly created in
langchain-ai#27253. The issue
currently exists only in `langchain-community==0.3.4`.

Test cases were added to prevent this issue in the future.

Co-authored-by: Erick Friis <[email protected]>
…iever.ipynb (langchain-ai#27842)

**Description**: Update VectorStore `.as_retriever` method url in
`vectorstore_retriever.ipynb`

Co-authored-by: Erick Friis <[email protected]>
…_vector.ipynb (langchain-ai#27843)

**Description**: Update VectorStore `max_marginal_relevance_search` api
reference url in `multi_vector.ipynb`

Co-authored-by: Erick Friis <[email protected]>
…ow_to.ipynb (langchain-ai#27844)

**Description**: Update VectorStore `as_retriever` method api reference
url in `qa_chat_history_how_to.ipynb`

Co-authored-by: Erick Friis <[email protected]>
### Description
Updates phrasing for the header of the `Messages` section.

Co-authored-by: Erick Friis <[email protected]>
Thank you for contributing to LangChain!

- **Description:** Updated Vectara integration
- **Issue:** refresh on descriptions across all demos and added UDF
reranker
- **Dependencies:** None
- **Twitter handle:** @ofermend

---------

Co-authored-by: Bagatur <[email protected]>
Co-authored-by: Erick Friis <[email protected]>
- **Description:** change to do the batch embedding server side and not
client side
- **Twitter handle:** @wildagsx

---------

Co-authored-by: ccurme <[email protected]>
@shjunn shjunn force-pushed the community_chroma_patch branch from 69de5ea to b2266db Compare November 4, 2024 23:54
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Nov 4, 2024
@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Nov 4, 2024
@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Nov 5, 2024
@efriis efriis enabled auto-merge (squash) November 5, 2024 19:41
@efriis efriis merged commit f6b2f82 into langchain-ai:master Nov 5, 2024
29 checks passed
yanomaly pushed a commit to yanomaly/langchain that referenced this pull request Nov 8, 2024
…-ai#27827)

There was a change of attribute name which was "max_batch_size". It's
now "get_max_batch_size" method.
I want to use "create_batches" which is right down below.

Please check this PR link.
reference: chroma-core/chroma#2305

---------

Signed-off-by: Prithvi Kannan <[email protected]>
Co-authored-by: Prithvi Kannan <[email protected]>
Co-authored-by: Bagatur <[email protected]>
Co-authored-by: Erick Friis <[email protected]>
Co-authored-by: Jun Yamog <[email protected]>
Co-authored-by: Bagatur <[email protected]>
Co-authored-by: ono-hiroki <[email protected]>
Co-authored-by: Dobiichi-Origami <[email protected]>
Co-authored-by: Chester Curme <[email protected]>
Co-authored-by: Duy Huynh <[email protected]>
Co-authored-by: Rashmi Pawar <[email protected]>
Co-authored-by: sifatj <[email protected]>
Co-authored-by: Eric Pinzur <[email protected]>
Co-authored-by: Daniel Vu Dao <[email protected]>
Co-authored-by: Ofer Mendelevitch <[email protected]>
Co-authored-by: Stéphane Philippart <[email protected]>
efriis pushed a commit that referenced this pull request Nov 20, 2024
Description:
* I'm planning to add `Document.id` support to the Chroma VectorStore,
but first I wanted to make sure all the integration tests were passing
first. They weren't. This PR fixes the broken tests.
* I found 2 issues:
* This change (from a year ago, exactly :) ) for supporting multi-modal
embeddings:
https://docs.trychroma.com/deployment/migration#migration-to-0.4.16---november-7,-2023
* This change #27827 due
to an update in the chroma client.
  
Also ran `format` and `lint` on the changes.

Note: I am not a member of the Chroma team.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Related to langchain-community lgtm PR looks good. Use to confirm that a PR is ready for merging. 🤖:nit Small modifications/deletions, fixes, deps or improvements to existing code or docs size:S This PR changes 10-29 lines, ignoring generated files. Ɑ: vector store Related to vector store module
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.