Commit

upgrade chroma to 0.4.0 (langchain-ai#7749)
** This should land Monday the 17th ** 

Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to
build, more durable, faster, smaller, and more extensible. This comes
with a few changes:

1. A simplified and improved client setup. Instead of having to remember
weird settings, users can just do `EphemeralClient`, `PersistentClient`
or `HttpClient` (the underlying direct `Client` implementation is also
still accessible)

2. We migrated data stores away from `duckdb` and `clickhouse`. This
changes the API for the `PersistentClient`, which used to reference
`chroma_db_impl="duckdb+parquet"`. Now we simply set
`is_persistent=True`. `is_persistent` is set to `True` for you if you
use `PersistentClient`.

3. Because we migrated away from `duckdb` and `clickhouse`, users need
to migrate their data into the new layout and schema. Chroma is
committed to providing extensive notification and tooling around any
schema and data migrations (for example, this PR!).
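
The settings change in item 2 can be sketched as a small translation helper. This is an illustrative sketch, not part of Chroma or LangChain: `upgrade_settings_kwargs` is a hypothetical name, the `./db` path is a placeholder, and the helper assumes the only legacy value that needs handling is `chroma_db_impl="duckdb+parquet"`, the one named above.

```python
from typing import Any, Dict


def upgrade_settings_kwargs(old: Dict[str, Any]) -> Dict[str, Any]:
    """Translate 0.3.x-style Settings keywords to the 0.4.0 style.

    Hypothetical helper for illustration; not a Chroma API.
    """
    new = dict(old)  # copy so the caller's dict is left untouched
    impl = new.pop("chroma_db_impl", None)
    if impl == "duckdb+parquet":
        # The persistent duckdb backend is replaced by a boolean flag.
        new["is_persistent"] = True
    elif impl is not None:
        # Other legacy backends are out of scope for this sketch.
        raise ValueError(f"unhandled chroma_db_impl: {impl!r}")
    return new


legacy = {
    "chroma_db_impl": "duckdb+parquet",
    "persist_directory": "./db",  # placeholder path
    "anonymized_telemetry": False,
}
upgraded = upgrade_settings_kwargs(legacy)
print(upgraded)
```

The resulting keywords match the updated notebook cell in this commit and could be passed on as `chromadb.config.Settings(**upgraded)`.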

After upgrading to `0.4.0`, if users try to access data that was stored
in the previous format, the system will throw an `Exception` and
instruct them how to use the migration assistant to migrate their data.
The migration assistant is a pip-installable CLI: install it with
`pip install chroma_migrate` and run it by calling `chroma_migrate`.
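
An upgrade script might check whether the migration CLI is available before touching existing data. A minimal sketch, assuming only what is stated above (the package is installed with `pip install chroma_migrate` and run as `chroma_migrate`); `migration_hint` is a hypothetical helper, not part of either package.

```python
import shutil


def migration_hint(cli_name: str = "chroma_migrate") -> str:
    """Return the next shell command to run for migrating pre-0.4.0 data.

    Hypothetical helper: it only looks the executable up on PATH and
    never runs the migration itself.
    """
    if shutil.which(cli_name) is None:
        # CLI not found on PATH, so it has to be installed first.
        return f"pip install {cli_name}"
    # CLI is available; calling it starts the migration assistant.
    return cli_name


print(migration_hint())
```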

-- TODO ADD here is a short video demonstrating how it works. 

Please reference the readme at
[chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate)
to see a full write-up of our philosophy on migrations as well as more
details about this particular migration.

Please direct any users facing issues upgrading to our Discord channel
called
[#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883).
We have also created an [email
listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers
directly in the future about breaking changes.

---------

Co-authored-by: Bagatur <[email protected]>
jeffchuber and baskaryan authored Jul 19, 2023
1 parent 1024637 commit 2139d01
Showing 3 changed files with 38 additions and 12 deletions.
@@ -43,7 +43,7 @@
 "\n",
 "# Instantiate 2 diff cromadb indexs, each one with a diff embedding.\n",
 "client_settings = chromadb.config.Settings(\n",
-" chroma_db_impl=\"duckdb+parquet\",\n",
+" is_persistent=True,\n",
 " persist_directory=DB_DIR,\n",
 " anonymized_telemetry=False,\n",
 ")\n",
46 changes: 36 additions & 10 deletions poetry.lock

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -193,7 +193,7 @@ deeplake = "^3.6.8"
 libdeeplake = "^0.0.60"
 weaviate-client = "^3.15.5"
 torch = "^1.0.0"
-chromadb = "^0.3.21"
+chromadb = "^0.4.0"
 tiktoken = "^0.3.3"
 python-dotenv = "^1.0.0"
 sentence-transformers = "^2"
