Genesis: Add support for CrateDB #1
7ab026c to 2443e5f
Re-scaffold from `langchain-cli@master`, because LANGCHAIN-28580 combined the sync/async vector store standard test suites, a change which apparently has not been released yet.
fe4f5df to cca3956
Import relevant test cases from `langchain-postgres`.

The translation machinery of `langchain-postgres` was based on native SQL first, but then migrated to use the special `jsonb_path_match` and `jsonb_exists` functions of PostgreSQL, which CrateDB does not provide. To overcome this dilemma, the `CrateDBVectorStore._handle_field_filter` method has been overridden and adjusted to do the right thing for CrateDB.

1. Standard comparison operations $eq, $ne, $lt, $lte, $gt, $gte

       op = COMPARISONS_TO_NATIVE[operator]
       condition = self.EmbeddingStore.cmetadata[field].op(op)(filter_value)

2. $between

       lower_bound = self.EmbeddingStore.cmetadata[field].op(">=")(low)
       upper_bound = self.EmbeddingStore.cmetadata[field].op("<=")(high)
       condition = sa.and_(lower_bound, upper_bound)

3. $exists

       condition = sa.literal(field).op("=")(
           sa.func.any(sa.func.object_keys(self.EmbeddingStore.cmetadata)))
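To make the three cases above easier to follow, here is a minimal sketch that renders each filter as a parameterized SQL fragment instead of a SQLAlchemy expression. It is not the code from this patch: the helper name `render_field_filter`, the contents of `COMPARISONS_TO_NATIVE`, the `?` placeholder style, and the identifier check are assumptions made for illustration. Only the `cmetadata` column name and the use of `object_keys` with `ANY` come from the commit message.

```python
# Illustrative sketch only, not the implementation in langchain-cratedb.
# The mapping contents and the helper name are assumptions.
COMPARISONS_TO_NATIVE = {
    "$eq": "=",
    "$ne": "!=",
    "$lt": "<",
    "$lte": "<=",
    "$gt": ">",
    "$gte": ">=",
}


def render_field_filter(field, operator, value):
    """Return (sql_fragment, parameters) for a filter on one metadata field."""
    # The field name is interpolated into the SQL text, so restrict it to
    # plain identifiers (an assumption made to keep the sketch safe).
    if not field.isidentifier():
        raise ValueError(f"Invalid field name: {field!r}")

    column = f"cmetadata['{field}']"

    # 1. Standard comparisons map directly to native SQL operators.
    if operator in COMPARISONS_TO_NATIVE:
        return f"{column} {COMPARISONS_TO_NATIVE[operator]} ?", [value]

    # 2. $between becomes an inclusive range check.
    if operator == "$between":
        low, high = value
        return f"({column} >= ? AND {column} <= ?)", [low, high]

    # 3. $exists checks the field name against the object's keys.
    #    Negation for a false value is left out of this sketch.
    if operator == "$exists":
        return "? = ANY(object_keys(cmetadata))", [field]

    raise ValueError(f"Unsupported operator: {operator}")
```

For example, `render_field_filter("id", "$between", (1, 3))` yields a fragment with two placeholders and the parameters `[1, 3]`, which mirrors the `sa.and_(lower_bound, upper_bound)` construction in the commit message.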
ccfcb76 to 0e77224
The GHA workflows inherited from langchain-postgres invoke linters and software tests only when needed, based on a diff of the source tree that detects whether something has changed. By extending the list of files of interest to include the Python project metadata files `pyproject.toml` and `poetry.lock`, this change ensures that dependency updates submitted by Dependabot will also invoke the software tests on CI.
Otherwise, scheduled runs would not use the most recent libraries relative to their version constraints, deferring part of the continuous testing procedure. Versions pinned in `poetry.lock` files will not have any meaning for downstream users installing your package anyway and, as such, are only suitable for exact-pinning dependency versions of _applications_. _Libraries_, on the other hand, need to work with a wide range of dependency versions, and are therefore not suitable for exact pinning.
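The diff-based gating described above can be sketched roughly as follows. This is a hypothetical illustration, not the script used by the workflows: the function name `should_run_checks` and the source and test path patterns are assumptions. Only `pyproject.toml` and `poetry.lock` are taken from the commit message.

```python
from fnmatch import fnmatch

# Hypothetical list of files of interest. The two metadata files are the
# ones this change adds; the other patterns are assumed for illustration.
FILES_OF_INTEREST = (
    "langchain_cratedb/*",
    "tests/*",
    "pyproject.toml",
    "poetry.lock",
)


def should_run_checks(changed_files):
    """Return True if any changed path matches a pattern of interest."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in FILES_OF_INTEREST
    )
```

With the metadata files in the list, a Dependabot pull request that touches only `poetry.lock` triggers the checks, while a change touching only `README.md` does not.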
Bumps [mypy](https://github.com/python/mypy) from 1.10.1 to 1.13.0.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](python/mypy@v1.10.1...v1.13.0)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.7 to 0.8.3.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](astral-sh/ruff@0.5.7...0.8.3)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2024 LangChain, Inc.
Copyright (c) 2024 Crate.io, Inc.
Is MIT License a requirement from LangChain? Or can we use Apache 2.0?
langchain-cratedb is vendoring large portions of langchain-postgres and langchain-mongodb, mostly test cases.
The project uses the MIT license, like the langchain-postgres project it is deriving from.
-- https://github.com/crate/langchain-cratedb?tab=readme-ov-file#license
Let's go with the MIT license
README.md
### Contributing
The `langchain-cratedb` package is an open source project, and is
[managed on GitHub]. The project is still in its infancy, and
we appreciate contributions of any kind.
- we appreciate contributions of any kind.
+ We appreciate contributions of any kind.
Add a few words about how to release the package, a missing line in the changelog, designating the 0.0.0 release, and an improvement to the README.
My comments have either been addressed or answered.
About
Provide an adapter for CrateDB to support LangChain's Vector Store, Document Loader, and Conversational Memory subsystems.
Details
This patch bundles a few others that were heading for upstream LangChain but did not land.
References