This repository provides LangChain components to connect your LangChain application with various Databricks services.
This package (langchain-databricks) will soon be consolidated into a new package: databricks-langchain. The new package will serve as the primary hub for all Databricks LangChain integrations.
In the coming months, databricks-langchain will include all features currently in langchain-databricks, as well as additional integrations to provide a unified experience for Databricks users.
For now, continue to use langchain-databricks as usual. When databricks-langchain is ready, we'll provide clear migration instructions to make the transition seamless. During the transition period, langchain-databricks will remain operational, and updates will be shared here with timelines and guidance.
Thank you for your support as we work toward an improved, streamlined experience!
- 🤖 LLMs: The ChatDatabricks component allows you to access chat endpoints hosted on Databricks Model Serving, including state-of-the-art models such as Llama3, Mixtral, and DBRX, as well as your own fine-tuned models.
- 📐 Vector Store: Databricks Vector Search is a serverless similarity search engine that allows you to store a vector representation of your data, including metadata, in a vector database. With Vector Search, you can create auto-updating vector search indexes from Delta tables managed by Unity Catalog and query them with a simple API to return the most similar vectors.
- 🔢 Embeddings: Provides components for working with embedding models hosted on Databricks Model Serving.
- 📊 MLflow Integration: LangChain Databricks components are fully integrated with MLflow, providing various LLMOps capabilities such as experiment tracking, dependency management, evaluation, and tracing (observability).
Note: This repository will replace all Databricks integrations currently present in the langchain-community package. Users are encouraged to migrate to this repository as soon as possible.
You can install the langchain-databricks package from PyPI.
pip install langchain-databricks
Here's a simple example of how to use the langchain-databricks package.
from langchain_databricks import ChatDatabricks
chat_model = ChatDatabricks(endpoint="databricks-meta-llama-3-70b-instruct")
response = chat_model.invoke("What is MLflow?")
print(response)
For more detailed usage examples and documentation, please refer to the LangChain documentation.
We welcome contributions to this project! Please follow the guidance below to set up the project for development and start contributing.
To contribute to this project, please follow the "fork and pull request" workflow. Please do not try to push directly to this repo unless you are a maintainer.
This project utilizes Poetry v1.7.1+ as a dependency manager.
❗Note: Before installing Poetry, if you use Conda, create and activate a new Conda env (e.g. conda create -n langchain python=3.9).
Install Poetry: see the official documentation for instructions on how to install it.
❗Note: If you use Conda or Pyenv as your environment/package manager, after installing Poetry, tell Poetry to use the virtualenv python environment (poetry config virtualenvs.prefer-active-python true).
The project configuration and the makefile for running dev commands are located under the libs/databricks directory.
cd libs/databricks
Install langchain-databricks development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):
poetry install --with lint,typing,test,test_integration,dev
Then verify the installation.
make test
If during installation you receive a WheelFileValidationError for debugpy, please make sure you are running Poetry v1.6.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases.
If you are still seeing this bug on v1.6.1+, you may also try disabling "modern installation" (poetry config installer.modern-installation false) and re-installing requirements.
See this debugpy issue for more details.
Unit tests cover modular logic that does not require calls to outside APIs. If you add new logic, please add a unit test.
To run unit tests:
make test
Integration tests cover the end-to-end service calls as much as possible. However, in certain cases this might not be practical, so you can mock the service response for these tests. There are examples of this in the repo that can help you write your own tests. If you have suggestions to improve this, please get in touch with us.
To run the integration tests:
make integration_test
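When calling the real endpoint is impractical, a mocked response can stand in for the service. The sketch below uses unittest.mock from the standard library; the client shape, endpoint name, and payload structure are illustrative assumptions, not this repo's actual test fixtures.

```python
from unittest.mock import MagicMock

# Stand-in for a model-serving client whose real endpoint we don't want
# to hit in CI. The return value mimics a chat-completion-style payload.
mock_client = MagicMock()
mock_client.predict.return_value = {
    "choices": [
        {"message": {"role": "assistant",
                     "content": "MLflow is an open-source MLOps platform."}}
    ]
}

# Code under test would call the client exactly as it would in production.
response = mock_client.predict(
    messages=[{"role": "user", "content": "What is MLflow?"}]
)
answer = response["choices"][0]["message"]["content"]
assert answer.startswith("MLflow")

# The mock also records how it was called, which the test can verify.
mock_client.predict.assert_called_once()
```

This keeps the test deterministic and runnable without Databricks credentials, while still exercising the code path that parses the service response.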
Formatting ensures that the code in this repo has a consistent style, making it more presentable and readable. The formatting command corrects these issues automatically when you run it. Linting finds and highlights code errors and helps avoid coding practices that can lead to errors.
Run both of these locally before submitting a PR. The CI scripts will run these when you submit a PR, and you won't be able to merge changes without fixing issues identified by the CI.
Formatting for this project is done via ruff.
To run format:
make format
Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command. This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.
make format_diff
Linting for this project is done via a combination of ruff and mypy.
To run lint:
make lint
In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command. This can be very helpful when you've made changes to only certain parts of the project and want to ensure your changes meet the linting standards without having to check the entire codebase.
make lint_diff
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
Spellchecking for this project is done via codespell.
Note that codespell finds common typos, so it can produce false positives (flagging correctly spelled but rarely used words) and false negatives (missing genuinely misspelled words).
To check spelling for this project:
make spell_check
To fix spelling in place:
make spell_fix
If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the pyproject.toml file.
[tool.codespell]
...
# Add here:
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
This project is licensed under the MIT License.