Genesis: Add support for CrateDB #1

Merged
merged 38 commits into from
Dec 19, 2024

Conversation

amotl
Member

@amotl amotl commented Dec 13, 2024

@amotl amotl force-pushed the cratedb branch 11 times, most recently from 7ab026c to 2443e5f Compare December 14, 2024 21:25
@amotl amotl force-pushed the cratedb branch 4 times, most recently from fe4f5df to cca3956 Compare December 14, 2024 22:58
Import relevant test cases from `langchain-postgres`.

The translation machinery of `langchain-postgres` was originally based on
native SQL, but later migrated to the PostgreSQL-specific functions
`jsonb_path_match` and `jsonb_exists`, which CrateDB does not provide.

To work around this, the `CrateDBVectorStore._handle_field_filter` method
has been overridden and adjusted to generate conditions that CrateDB
supports.


1. Standard comparison operations $eq, $ne, $lt, $lte, $gt, $gte

op = COMPARISONS_TO_NATIVE[operator]
condition = self.EmbeddingStore.cmetadata[field].op(op)(filter_value)


2. $between

lower_bound = self.EmbeddingStore.cmetadata[field].op(">=")(low)
upper_bound = self.EmbeddingStore.cmetadata[field].op("<=")(high)
condition = sa.and_(lower_bound, upper_bound)


3. $exists

condition = sa.literal(field).op("=")(
  sa.func.any(sa.func.object_keys(self.EmbeddingStore.cmetadata)))
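
The three cases above can be sketched without SQLAlchemy as a small function that renders a SQL fragment plus bind parameters. This is only an illustration of the translation idea, not the code from this pull request: the `render_field_filter` helper, the contents of the operator table, the `?` placeholder style, and the handling of a false `$exists` value are all assumptions.

```python
from typing import Any

# Assumed operator table; the real COMPARISONS_TO_NATIVE mapping is not shown above.
COMPARISONS_TO_NATIVE = {
    "$eq": "=",
    "$ne": "!=",
    "$lt": "<",
    "$lte": "<=",
    "$gt": ">",
    "$gte": ">=",
}


def render_field_filter(field: str, operator: str, value: Any) -> tuple[str, list[Any]]:
    """Render one metadata filter as a SQL fragment and its bind parameters."""
    # Only accept plain identifiers, because the field name is embedded in the SQL text.
    if not field.isidentifier():
        raise ValueError(f"Unsupported field name: {field!r}")
    column = f"cmetadata['{field}']"

    # 1. Standard comparison operations
    if operator in COMPARISONS_TO_NATIVE:
        return f"{column} {COMPARISONS_TO_NATIVE[operator]} ?", [value]

    # 2. $between: an inclusive range built from two comparisons
    if operator == "$between":
        low, high = value
        return f"{column} >= ? AND {column} <= ?", [low, high]

    # 3. $exists: look up the key in the list of keys of the metadata object
    if operator == "$exists":
        condition = "? = ANY(object_keys(cmetadata))"
        if not value:
            # Assumption: a false value negates the check.
            condition = f"NOT ({condition})"
        return condition, [field]

    raise ValueError(f"Unsupported operator: {operator!r}")
```

For example, `render_field_filter("price", "$between", (1, 5))` yields a fragment with two comparisons on `cmetadata['price']` and the parameters `[1, 5]`.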
@amotl amotl force-pushed the cratedb branch 2 times, most recently from ccfcb76 to 0e77224 Compare December 15, 2024 23:47
The GHA workflows inherited from langchain-postgres invoke linters and
software tests only when needed, based on a diff of the source tree
that detects whether something has changed.

By extending the list of files of interest to include Python project
metadata files `pyproject.toml` and `poetry.lock`, this change ensures
that dependency updates submitted by Dependabot will also invoke the
software tests on CI.
Otherwise, scheduled runs would not use the most recent libraries
permitted by their version constraints, which would defer part
of the continuous testing procedure.
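
The change-detection idea described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the workflow definition itself: the `FILES_OF_INTEREST` patterns and the `should_run_checks` helper are assumed names, and the actual workflows may detect changes differently.

```python
from fnmatch import fnmatch

# Hypothetical list of paths that should trigger linters and tests.
# The last two entries are the project metadata files mentioned above.
FILES_OF_INTEREST = [
    "langchain_cratedb/*",
    "tests/*",
    "pyproject.toml",
    "poetry.lock",
]


def should_run_checks(changed_files: list[str]) -> bool:
    """Return True if any changed file matches a pattern of interest."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in FILES_OF_INTEREST
    )
```

With this list, a change that only touches `poetry.lock` triggers the checks, while a change that only touches `README.md` does not.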

Versions pinned in `poetry.lock` files will not have any meaning for
downstream users installing your package anyway, and, as such, are only
suitable for exact-pinning dependency versions of _applications_.

On the other hand, _libraries_ need to work with a wide range of
dependency versions, both older and newer, and are, as such, not
suitable for corresponding exact-pinning procedures.
dependabot bot added 2 commits December 16, 2024 01:36
Bumps [mypy](https://github.com/python/mypy) from 1.10.1 to 1.13.0.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](python/mypy@v1.10.1...v1.13.0)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.7 to 0.8.3.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](astral-sh/ruff@0.5.7...0.8.3)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
@amotl amotl requested review from kneth and surister December 16, 2024 01:06
@amotl amotl marked this pull request as ready for review December 16, 2024 01:08
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2024 LangChain, Inc.
Copyright (c) 2024 Crate.io, Inc.

Is MIT License a requirement from LangChain? Or can we use Apache 2.0?

Member Author

langchain-cratedb is vendoring large portions of langchain-postgres and langchain-mongodb, mostly test cases.

The project uses the MIT license, like the langchain-postgres project it is deriving from.

-- https://github.com/crate/langchain-cratedb?tab=readme-ov-file#license

Let's go with the MIT license

README.md Outdated
### Contributing
The `langchain-cratedb` package is an open source project, and is
[managed on GitHub]. The project is still in its infancy, and
we appreciate contributions of any kind.

Suggested change
we appreciate contributions of any kind.
We appreciate contributions of any kind.

Add a few words about how to release the package, a missing line in the
changelog, designating the 0.0.0 release, and an improvement to the
README.
@amotl amotl requested a review from kneth December 18, 2024 15:06
@kneth kneth left a comment

My comments have either been addressed or answered.

@amotl amotl merged commit c966e75 into main Dec 19, 2024
12 of 14 checks passed
@amotl amotl deleted the cratedb branch December 19, 2024 11:10