Pinecone Document Store - minimal implementation (#81)
* Add PineconeDocumentStore
* adapt to Document refactoring
* start improving existing tests
* try to setup a testing workflow
* fix some format errors
* adapt to new structure
* adapt pyproject; rm about
* fix workflow
* add hatch-vcs
* simplification - first draft
* simplified tests
* make workflow read the api key
* rm score when filtering docs
* increase wait time
* improve api key reading; more tests
* improvements from PR review
* test simplification
* test simplification 2
* fix
* std ds tests want valueerror
* put tests together
* format
* add fallback for namespace in _embedding_retrieval
* try to parallelize tests
* better try
* labeler
* format fix
* Apply suggestions from code review

  Co-authored-by: Massimiliano Pippi <[email protected]>
* Revert "Apply suggestions from code review"

  This reverts commit f42c540.
* improve document conversion
* rm deepcopy
* missing return
* fix fmt
* copy metadata
* fmt
* mv comment
* improve tests
* readmes

---------

Co-authored-by: vrunm <[email protected]>
Co-authored-by: Massimiliano Pippi <[email protected]>
1 parent e7d79e7, commit fbdb9a0
Showing 11 changed files with 722 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# This workflow comes from https://github.com/ofek/hatch-mypyc | ||
# https://github.com/ofek/hatch-mypyc/blob/5a198c0ba8660494d02716cfc9d79ce4adfb1442/.github/workflows/test.yml | ||
name: Test / pinecone | ||
|
||
on: | ||
  schedule: | ||
    - cron: "0 0 * * *" | ||
  pull_request: | ||
    paths: | ||
      - "integrations/pinecone/**" | ||
      - ".github/workflows/pinecone.yml" | ||
|
||
concurrency: | ||
  group: pinecone-${{ github.head_ref }} | ||
  cancel-in-progress: true | ||
|
||
env: | ||
  PYTHONUNBUFFERED: "1" | ||
  FORCE_COLOR: "1" | ||
  PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }} | ||
|
||
jobs: | ||
  run: | ||
    name: Python ${{ matrix.python-version }} on ${{ startsWith(matrix.os, 'macos-') && 'macOS' || startsWith(matrix.os, 'windows-') && 'Windows' || 'Linux' }} | ||
    runs-on: ${{ matrix.os }} | ||
    strategy: | ||
      fail-fast: false | ||
      matrix: | ||
        # Pinecone tests are time expensive, so the matrix is limited to Python 3.9 and 3.10 | ||
        os: [ubuntu-latest] | ||
        python-version: ["3.9", "3.10"] | ||
|
||
    steps: | ||
      - uses: actions/checkout@v4 | ||
|
||
      - name: Set up Python ${{ matrix.python-version }} | ||
        uses: actions/setup-python@v5 | ||
        with: | ||
          python-version: ${{ matrix.python-version }} | ||
|
||
      - name: Install Hatch | ||
        run: pip install --upgrade hatch | ||
|
||
      - name: Lint | ||
        working-directory: integrations/pinecone | ||
        if: matrix.python-version == '3.9' | ||
        run: hatch run lint:all | ||
|
||
      - name: Run tests | ||
        working-directory: integrations/pinecone
        run: hatch run cov |
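
The workflow above hands the repository secret to the test run as the `PINECONE_API_KEY` environment variable. A minimal sketch of how test code might pick that value up follows. `resolve_api_key` is a hypothetical helper written for illustration, not a function from this commit, and raising `ValueError` when no key is found is an assumption. The empty-string check matters because GitHub does not pass secrets to workflows triggered by pull requests from forks, in which case the variable is set but empty.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return an API key from an explicit argument or from the environment.

    Hypothetical helper: the name and behaviour are assumptions for this sketch.
    """
    # Allow a custom mapping so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    api_key = explicit or env.get("PINECONE_API_KEY")
    if not api_key:
        # Covers both a missing variable and an empty string.
        msg = "No API key given and PINECONE_API_KEY is not set"
        raise ValueError(msg)
    return api_key


if __name__ == "__main__":
    print(resolve_api_key(env={"PINECONE_API_KEY": "example-key"}))
```

An explicit argument takes precedence over the environment here, so a locally supplied key is not silently overridden by whatever the CI runner exports.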
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
[![test](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/pinecone.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/pinecone.yml) | ||
|
||
[![PyPI - Version](https://img.shields.io/pypi/v/pinecone-haystack.svg)](https://pypi.org/project/pinecone-haystack) | ||
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pinecone-haystack.svg)](https://pypi.org/project/pinecone-haystack) | ||
|
||
# Pinecone Document Store | ||
|
||
A Document Store for Haystack 2.x that uses Pinecone as its backend. | ||
|
||
## Installation | ||
|
||
```console | ||
pip install pinecone-haystack | ||
``` | ||
|
||
## Testing | ||
|
||
```console | ||
hatch run test | ||
``` | ||
|
||
## License | ||
|
||
`pinecone-haystack` is distributed under the terms of the [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) license. |
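
After running the install command from the README above, one way to confirm that the distribution is visible to the interpreter is to query its metadata. This is a small sketch using only the standard library. `installed_version` is a hypothetical helper, and the distribution name is taken from the `pip install` line in the README.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pinecone-haystack")
    if found is None:
        print("pinecone-haystack is not installed")
    else:
        print(f"pinecone-haystack {found}")
```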
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,186 @@ | ||
[build-system] | ||
requires = ["hatchling", "hatch-vcs"] | ||
build-backend = "hatchling.build" | ||
|
||
[project] | ||
name = "pinecone_haystack" | ||
dynamic = ["version"] | ||
description = '' | ||
readme = "README.md" | ||
requires-python = ">=3.8" | ||
license = "Apache-2.0" | ||
keywords = [] | ||
authors = [ | ||
{ name = "deepset GmbH", email = "[email protected]" }, | ||
] | ||
classifiers = [ | ||
"Development Status :: 4 - Beta", | ||
"Programming Language :: Python", | ||
"Programming Language :: Python :: 3.8", | ||
"Programming Language :: Python :: 3.9", | ||
"Programming Language :: Python :: 3.10", | ||
"Programming Language :: Python :: 3.11", | ||
"Programming Language :: Python :: Implementation :: CPython", | ||
"Programming Language :: Python :: Implementation :: PyPy", | ||
] | ||
dependencies = [ | ||
"haystack-ai", | ||
"pinecone-client", | ||
] | ||
|
||
[project.urls] | ||
Documentation = "https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/pinecone#readme" | ||
Issues = "https://github.com/deepset-ai/haystack-core-integrations/issues" | ||
Source = "https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/pinecone" | ||
|
||
[tool.hatch.version] | ||
source = "vcs" | ||
tag-pattern = 'integrations\/pinecone-v(?P<version>.*)' | ||
|
||
[tool.hatch.version.raw-options] | ||
root = "../.." | ||
git_describe_command = 'git describe --tags --match="integrations/pinecone-v[0-9]*"' | ||
|
||
[tool.hatch.envs.default] | ||
dependencies = [ | ||
"coverage[toml]>=6.5", | ||
"pytest", | ||
"pytest-xdist", | ||
] | ||
[tool.hatch.envs.default.scripts] | ||
# Pinecone tests are slow (require HTTP requests), so we run them in parallel | ||
# with pytest-xdist (https://pytest-xdist.readthedocs.io/en/stable/distribution.html) | ||
test = "pytest -n auto --maxprocesses=3 {args:tests}" | ||
test-cov = "coverage run -m pytest -n auto --maxprocesses=3 {args:tests}" | ||
cov-report = [ | ||
"- coverage combine", | ||
"coverage report", | ||
] | ||
cov = [ | ||
"test-cov", | ||
"cov-report", | ||
] | ||
|
||
[[tool.hatch.envs.all.matrix]] | ||
python = ["3.8", "3.9", "3.10", "3.11"] | ||
|
||
[tool.hatch.envs.lint] | ||
detached = true | ||
dependencies = [ | ||
"black>=23.1.0", | ||
"mypy>=1.0.0", | ||
"ruff>=0.0.243", | ||
"numpy", | ||
] | ||
[tool.hatch.envs.lint.scripts] | ||
typing = "mypy --install-types --non-interactive {args:src/pinecone_haystack tests}" | ||
style = [ | ||
"ruff {args:.}", | ||
"black --check --diff {args:.}", | ||
] | ||
fmt = [ | ||
"black {args:.}", | ||
"ruff --fix {args:.}", | ||
"style", | ||
] | ||
all = [ | ||
"style", | ||
"typing", | ||
] | ||
|
||
[tool.hatch.metadata] | ||
allow-direct-references = true | ||
|
||
[tool.black] | ||
target-version = ["py37"] | ||
line-length = 120 | ||
skip-string-normalization = true | ||
|
||
[tool.ruff] | ||
target-version = "py37" | ||
line-length = 120 | ||
select = [ | ||
"A", | ||
"ARG", | ||
"B", | ||
"C", | ||
"DTZ", | ||
"E", | ||
"EM", | ||
"F", | ||
"FBT", | ||
"I", | ||
"ICN", | ||
"ISC", | ||
"N", | ||
"PLC", | ||
"PLE", | ||
"PLR", | ||
"PLW", | ||
"Q", | ||
"RUF", | ||
"S", | ||
"T", | ||
"TID", | ||
"UP", | ||
"W", | ||
"YTT", | ||
] | ||
ignore = [ | ||
# Allow non-abstract empty methods in abstract base classes | ||
"B027", | ||
# Allow boolean positional values in function calls, like `dict.get(... True)` | ||
"FBT003", | ||
# Ignore checks for possible passwords | ||
"S105", "S106", "S107", | ||
# Ignore complexity | ||
"C901", "PLR0911", "PLR0912", "PLR0913", "PLR0915", | ||
] | ||
unfixable = [ | ||
# Don't touch unused imports | ||
"F401", | ||
] | ||
|
||
[tool.ruff.isort] | ||
known-first-party = ["pinecone_haystack"] | ||
|
||
[tool.ruff.flake8-tidy-imports] | ||
ban-relative-imports = "all" | ||
|
||
[tool.ruff.per-file-ignores] | ||
# Tests can use magic values, assertions, and relative imports | ||
"tests/**/*" = ["PLR2004", "S101", "TID252"] | ||
|
||
[tool.coverage.run] | ||
source_pkgs = ["pinecone_haystack", "tests"] | ||
branch = true | ||
parallel = true | ||
omit = [ | ||
"example" | ||
] | ||
|
||
[tool.coverage.paths] | ||
pinecone_haystack = ["src/pinecone_haystack", "*/pinecone_haystack/src/pinecone_haystack"] | ||
tests = ["tests", "*/pinecone_haystack/tests"] | ||
|
||
[tool.coverage.report] | ||
exclude_lines = [ | ||
"no cov", | ||
"if __name__ == .__main__.:", | ||
"if TYPE_CHECKING:", | ||
] | ||
|
||
[tool.pytest.ini_options] | ||
minversion = "6.0" | ||
markers = [ | ||
"unit: unit tests", | ||
"integration: integration tests" | ||
] | ||
|
||
[[tool.mypy.overrides]] | ||
module = [ | ||
"pinecone.*", | ||
"haystack.*", | ||
"pytest.*" | ||
] | ||
ignore_missing_imports = true |
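
The `[tool.hatch.version]` table in the file above derives the package version from Git tags of the form `integrations/pinecone-v<version>`. The sketch below applies the same regular expression with Python's `re` module to show which part of a tag ends up in the `version` group. `version_from_tag` is a hypothetical helper and the sample tags are made up for illustration; how hatch-vcs applies the pattern internally is not shown here.

```python
import re
from typing import Optional

# Same pattern as the tag-pattern setting in pyproject.toml.
TAG_PATTERN = re.compile(r"integrations\/pinecone-v(?P<version>.*)")


def version_from_tag(tag: str) -> Optional[str]:
    """Extract the version part of a tag, or None if the tag is for another package."""
    match = TAG_PATTERN.match(tag)
    return match.group("version") if match else None


if __name__ == "__main__":
    # Sample tags are illustrative only.
    for tag in ("integrations/pinecone-v0.1.0", "integrations/other-v0.2.0"):
        print(tag, "->", version_from_tag(tag))
```

Scoping the pattern to the `integrations/pinecone-v` prefix is what lets several packages share one repository while each reads only its own tags.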
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# SPDX-FileCopyrightText: 2023-present deepset GmbH <[email protected]> | ||
# | ||
# SPDX-License-Identifier: Apache-2.0 | ||
from pinecone_haystack.document_store import PineconeDocumentStore | ||
|
||
__all__ = ["PineconeDocumentStore"] |
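
The `__all__` list in the file above limits what a star import of the package would bring in. Below is a self-contained sketch of that behaviour. It uses a throwaway in-memory module named `demo_store_pkg` (hypothetical, so the example runs without the real package installed), and the extra names in it exist only to show that they are filtered out.

```python
import sys
import types

# Build a stand-in module in memory; the real package is `pinecone_haystack`.
demo = types.ModuleType("demo_store_pkg")
exec(
    "class PineconeDocumentStore: ...\n"
    "class InternalHelper: ...\n"
    "helper_constant = 1\n"
    "__all__ = ['PineconeDocumentStore']\n",
    demo.__dict__,
)
sys.modules["demo_store_pkg"] = demo

# Run a star import into a fresh namespace and collect what arrived.
namespace = {}
exec("from demo_store_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))

print(exported)
```

Names left out of `__all__` are still reachable through explicit attribute access on the module; only the star import is restricted.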