astradb: bootstrapping Astra DB as Partner Package (#16875)
**Description:** This PR introduces a new "Astra DB" Partner Package.

So far only the vector store class is _duplicated_ there; all others
will follow once this is validated and established.

Along with the move to a separate package, the class name
changes: `AstraDB` => `AstraDBVectorStore`.

The strategy has been to duplicate the module (with planned removal
from community at LangChain 0.2). Until then, the code will be kept in
sync with minimal, known differences. There is a makefile target to
automate drift control; for convenience with this check, the
community package has a class `AstraDBVectorStore` aliased to `AstraDB`
at the end of the module.
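
The aliasing described above can be sketched as follows. This is a minimal illustration, not the actual module: the stub class and its `collection_name` parameter are assumptions made for the example, and only the alias assignment on the last line of the "module" reflects what the description states.

```python
# Hypothetical stand-in for the community module. The real class has a much
# larger API; only the alias pattern below is the point of this sketch.
class AstraDB:
    """Stub vector store (illustrative only)."""

    def __init__(self, collection_name: str) -> None:
        self.collection_name = collection_name


# Alias at the end of the module: both names refer to the same class object,
# so a name-based comparison against the partner package sees the new name.
AstraDBVectorStore = AstraDB

store = AstraDBVectorStore(collection_name="demo")
print(type(store).__name__)  # the class still reports its original name
```

Because the alias is a plain assignment, no behavior changes: existing imports of `AstraDB` keep working while the new name is also available.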

With this PR several bugfixes and improvements come to the vector store,
as well as a reshuffling of the doc pages/notebooks (Astra and
Cassandra) to align with the move to a separate package.

**Dependencies:** A brand new pyproject.toml in the new package, no
changes otherwise.

**Twitter handle:** `@rsprrs`

---------

Co-authored-by: Christophe Bornet <[email protected]>
Co-authored-by: Erick Friis <[email protected]>
3 people authored Feb 15, 2024
1 parent f6f0ca1 commit 5240eca
Showing 33 changed files with 4,621 additions and 447 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/_integration_test.yml
@@ -67,6 +67,9 @@ jobs:
WATSONX_PROJECT_ID: ${{ secrets.WATSONX_PROJECT_ID }}
PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
PINECONE_ENVIRONMENT: ${{ secrets.PINECONE_ENVIRONMENT }}
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_API_ENDPOINT }}
ASTRA_DB_APPLICATION_TOKEN: ${{ secrets.ASTRA_DB_APPLICATION_TOKEN }}
ASTRA_DB_KEYSPACE: ${{ secrets.ASTRA_DB_KEYSPACE }}
run: |
make integration_tests
3 changes: 3 additions & 0 deletions .github/workflows/_release.yml
@@ -187,6 +187,9 @@ jobs:
WATSONX_PROJECT_ID: ${{ secrets.WATSONX_PROJECT_ID }}
PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
PINECONE_ENVIRONMENT: ${{ secrets.PINECONE_ENVIRONMENT }}
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_API_ENDPOINT }}
ASTRA_DB_APPLICATION_TOKEN: ${{ secrets.ASTRA_DB_APPLICATION_TOKEN }}
ASTRA_DB_KEYSPACE: ${{ secrets.ASTRA_DB_KEYSPACE }}
run: make integration_tests
working-directory: ${{ inputs.working-directory }}
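
The three variables added to both workflows are what the integration tests would read at run time. A minimal sketch of how a test helper might pick them up is given below; the helper name `astra_settings`, the returned dictionary keys, and the choice to treat the keyspace as optional are all assumptions for illustration and are not taken from this commit.

```python
import os
from typing import Mapping, Optional

# Variable names as they appear in the workflow files above.
REQUIRED_VARS = ("ASTRA_DB_API_ENDPOINT", "ASTRA_DB_APPLICATION_TOKEN")
OPTIONAL_KEYSPACE_VAR = "ASTRA_DB_KEYSPACE"  # assumed optional in this sketch


def astra_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect Astra DB connection settings from an environment mapping.

    Hypothetical helper: raises if a required variable is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "api_endpoint": env["ASTRA_DB_API_ENDPOINT"],
        "token": env["ASTRA_DB_APPLICATION_TOKEN"],
        "keyspace": env.get(OPTIONAL_KEYSPACE_VAR),  # None when unset
    }
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise without touching real secrets.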

113 changes: 74 additions & 39 deletions docs/docs/integrations/document_loaders/cassandra.ipynb
@@ -72,57 +72,72 @@
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Init from a cassandra driver Session\n",
"\n",
"You need to create a `cassandra.cluster.Session` object, as described in the [Cassandra driver documentation](https://docs.datastax.com/en/developer/python-driver/latest/api/cassandra/cluster/#module-cassandra.cluster). The details vary (e.g. with network settings and authentication), but this might be something like:"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"from cassandra.cluster import Cluster\n",
"\n",
"cluster = Cluster()\n",
"session = cluster.connect()"
],
"metadata": {
"collapsed": false
},
"execution_count": null
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"You need to provide the name of an existing keyspace of the Cassandra instance:"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"CASSANDRA_KEYSPACE = input(\"CASSANDRA_KEYSPACE = \")"
],
"metadata": {
"collapsed": false
},
"execution_count": null
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"Creating the document loader:"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
@@ -144,18 +159,21 @@
},
{
"cell_type": "code",
"outputs": [],
"source": [
"docs = loader.load()"
],
"execution_count": 17,
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-01-19T15:47:26.399472Z",
"start_time": "2024-01-19T15:47:26.389145Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"execution_count": 17
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "code",
@@ -169,7 +187,9 @@
"outputs": [
{
"data": {
"text/plain": "Document(page_content='Row(_id=\\'659bdffa16cbc4586b11a423\\', title=\\'Dangerous Men\\', reviewtext=\\'\"Dangerous Men,\" the picture\\\\\\'s production notes inform, took 26 years to reach the big screen. After having seen it, I wonder: What was the rush?\\')', metadata={'table': 'movie_reviews', 'keyspace': 'default_keyspace'})"
"text/plain": [
"Document(page_content='Row(_id=\\'659bdffa16cbc4586b11a423\\', title=\\'Dangerous Men\\', reviewtext=\\'\"Dangerous Men,\" the picture\\\\\\'s production notes inform, took 26 years to reach the big screen. After having seen it, I wonder: What was the rush?\\')', metadata={'table': 'movie_reviews', 'keyspace': 'default_keyspace'})"
]
},
"execution_count": 19,
"metadata": {},
@@ -182,17 +202,27 @@
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Init from cassio\n",
"\n",
"It's also possible to use cassio to configure the session and keyspace."
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"import cassio\n",
@@ -204,11 +234,16 @@
")\n",
"\n",
"docs = loader.load()"
],
"metadata": {
"collapsed": false
},
"execution_count": null
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Attribution statement\n",
"\n",
"> Apache Cassandra, Cassandra and Apache are either registered trademarks or trademarks of the [Apache Software Foundation](http://www.apache.org/) in the United States and/or other countries."
]
}
],
"metadata": {
@@ -233,7 +268,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
"version": "3.9.17"
}
},
"nbformat": 4,
12 changes: 11 additions & 1 deletion docs/docs/integrations/llms/llm_caching.ipynb
@@ -1131,6 +1131,16 @@
"print(llm(\"How come we always see one face of the moon?\"))"
]
},
{
"cell_type": "markdown",
"id": "55dc84b3-37cb-4f19-b175-40e18e06f83f",
"metadata": {},
"source": [
"#### Attribution statement\n",
"\n",
">Apache Cassandra, Cassandra and Apache are either registered trademarks or trademarks of the [Apache Software Foundation](http://www.apache.org/) in the United States and/or other countries."
]
},
{
"cell_type": "markdown",
"id": "8712f8fc-bb89-4164-beb9-c672778bbd91",
@@ -1588,7 +1598,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
"version": "3.9.17"
}
},
"nbformat": 4,
@@ -32,7 +32,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet \"astrapy>=0.6.2\""
"%pip install --upgrade --quiet \"astrapy>=0.7.1\""
]
},
{
@@ -145,6 +145,24 @@
"source": [
"message_history.messages"
]
},
{
"cell_type": "markdown",
"id": "59902d0f-e9ba-4e3d-a7e0-ce202b9d3c43",
"metadata": {},
"source": [
"#### Attribution statement\n",
"\n",
"> Apache Cassandra, Cassandra and Apache are either registered trademarks or trademarks of the [Apache Software Foundation](http://www.apache.org/) in the United States and/or other countries."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7efaa51c-e9ee-4dce-80a4-eb9280a0dbe5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -163,7 +181,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.9.17"
}
},
"nbformat": 4,