-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V3 graph implmentations #1593
V3 graph implmentations #1593
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just needs a few tweaks. Would like to take another look tomorrow before we merge
js/sdk/src/r2rClient.ts
Outdated
*/ | ||
@feature("getTriples") | ||
async getTriples( | ||
@feature("getRelationships") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the SDK methods here to getTriples, as not to break any current implementation. We can also add an @deprecated
warning to this such that it warns users to switch to the v3 SDK methods
py/cli/commands/v2/kg.py
Outdated
@@ -229,7 +229,7 @@ async def get_entities( | |||
@click.option( | |||
"--collection-id", | |||
required=True, | |||
help="Collection ID to retrieve triples from.", | |||
help="Collection ID to retrieve relationships from.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the SDK, let's leave the CLI as is. I have a deprecation wrapper that we can add to these methods, and then we will point users to the new command.
WrappedKGTunePromptResponse, | ||
) | ||
|
||
|
||
from shared.api.models.kg.responses_v3 import ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to get ride of the validation on the v2 endpoints—it's there for legacy, and should work, but, no need to have two versions of this for legacy code.
@@ -308,9 +309,9 @@ async def get_triples( | |||
description="Number of items to return. Use -1 to return all items.", | |||
), | |||
auth_user=Depends(self.service.providers.auth.auth_wrapper), | |||
) -> WrappedKGTriplesResponse: | |||
) -> WrappedKGRelationshipsResponse: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should modify these to WrappedRelationshipResponse
and WrappedRelationshipsResponse
to match the rest of the codebase
py/core/main/api/v3/graph_router.py
Outdated
@@ -1,22 +1,33 @@ | |||
import logging | |||
import textwrap | |||
from typing import Optional | |||
from typing import Optional, Union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but just use |
@@ -110,7 +110,7 @@ async def local_search( | |||
|
|||
# entity search | |||
search_type = "__Entity__" | |||
async for search_result in await self.database_provider.vector_query( # type: ignore | |||
async for search_result in await self.database_provider.graph_search( # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but lets stop passing kwargs and move to args whenever possible
vector_column_str = _decorate_vector_type( | ||
f"({self.dimension})", self.quantization_type | ||
) | ||
|
||
query = f""" | ||
CREATE TABLE IF NOT EXISTS {self._get_table_name("chunk_entity")} ( | ||
id SERIAL PRIMARY KEY, | ||
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), | ||
sid SERIAL NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make sure this gets in the migration script
py/core/providers/database/kg.py
Outdated
if entity_level is None: | ||
raise ValueError("Entity level is not set") | ||
|
||
for entity in entities: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this to a generator?
if not all(entity.level == entity_level for entity in entities):
raise ValueError("All entities must be of the same level")
py/sdk/v2/kg.py
Outdated
@@ -103,39 +103,39 @@ async def get_entities( | |||
|
|||
return await self._make_request("GET", "entities", params=params) # type: ignore | |||
|
|||
async def get_triples( | |||
async def get_relationships( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep v2 method names, it's fine to change the internal naming conventions, though
py/shared/api/models/kg/responses.py
Outdated
WrappedKGTriplesResponse = ResultsWrapper[KGTriplesResponse] | ||
|
||
# KG Entities | ||
WrappedKGEntityResponse = ResultsWrapper[KGEntitiesResponse] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but it should be singular KGEntitiesResponse
and so on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 1cb66cf in 2 minutes and 51 seconds
More details
- Looked at
10379
lines of code in57
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. py/core/main/api/v2/kg_router.py:312
- Draft comment:
The parametertriple_ids
should be renamed torelationship_ids
to maintain consistency with the rest of the code changes.
relationship_ids: Optional[list[str]] = Query(
- Reason this comment was not posted:
Marked as duplicate.
2. py/tests/core/providers/kg/test_kg_logic.py:87
- Draft comment:
Consider renamingtriples_raw_list
torelationships_raw_list
for consistency with the updated terminology. - Reason this comment was not posted:
Confidence changes required:50%
The PR changes the term 'triples' to 'relationships' across the codebase. However, in the test file, the variable name 'triples_raw_list' should also be updated to 'relationships_raw_list' for consistency.
Workflow ID: wflow_aL6oV2GYDiYT7UXk
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
@@ -308,9 +311,9 @@ async def get_triples( | |||
description="Number of items to return. Use -1 to return all items.", | |||
), | |||
auth_user=Depends(self.service.providers.auth.auth_wrapper), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter triple_ids
should be renamed to relationship_ids
to maintain consistency with the rest of the code changes.
auth_user=Depends(self.service.providers.auth.auth_wrapper), | |
relationship_ids: Optional[list[str]] = Query( |
* improve ci/cd runtime * update prompt tests * improve ci/cd runtime (#1535) * improve ci/cd runtime * update prompt tests * Support Python ^3.10 (#1534) * add azure * up * up * spec out v3 api * checkin document router * adding chunk abstr * add list chunks * add chunk search * up * add users routes * up * checkin progress * add collections annotations * add indices * add user * checkin work * up * complete conversations CRUD * fix type errors * add graph router * add graphs * Update JS (#1563) * Feature/add graph to v3 (#1565) * complete simple tests, cleanup routers * up * Harmonize Pagination across endpoints (#1564) * Pagination * Add fixmes * Fix nested deletion filter bug (#1567) * Remove Mintlify docs (#1569) * Nolan/list collection (#1568) * Check in * More * Fix nested transactions issue in sqlite logger * Fix update collection return type * JS V3 (#1571) * Sync collections JS * More documents * Clean up messy code * list not List * Users first pass * User tests and fixmes * More * typo * More prompts * Pre-commit improvements * Remove prints * Cleanups on conversations * Branches response * Chunks * More work on the return types * Jest config * Fix branch creation time * Fix lock * Nolan/v3 tests (#1578) * Add deprecated command back * Add warning * Fix GraphRAG tests (#1579) * More cleanup (#1580) * More cleanup * More * Fix test * More cleanups * More cleanups * More * Merge main * Python SDK V3 (#1585) * Python SDK V3 * Fix * First pass (#1586) * More V3 (#1587) * Validation errors * Update js test * more * Fix sync methods on v2 sdk, add check for download files (#1588) * More CLI (#1589) * Print logs on failing tests (#1590) * Print logs on failing tests * MOre * cleanup * Again * Again * More JS testing (#1591) * More JS testing * Cleanup * More refactors for tests (#1592) * System Routes (#1594) * Fix type errors, pass collection id (#1595) * Hotfix: dict * V3 graph implmentations (#1593) * complete simple tests, cleanup routers * up * up * checkin * up * up * response models * checkin * up * checkin * up * up * up * up * up * up * v2 * up * up * up * up --------- Co-authored-by: emrgnt-cmplxty <[email protected]> * Allow passing of collection id at document ingestion (#1596) * KG Response sync (#1597) * fix * Fix Prompt Override (#1599) * Fix Prompt Override * print * Caching * Fix * Updated Graph Models, Drop SID (#1598) * New Graph Models * Fix * minor tweaks * fix summary model (#1604) * incr progress * Add /users/me (#1605) * Add /users/me * oops * Resolve Merge Conflicts (#1607) * Fix conflicts * Clean up * Nolan/conflicts (#1608) * expose reset data to admin (#1602) * up (#1603) * up * up * wtf github is a piece of garbage --------- Co-authored-by: emrgnt-cmplxty <[email protected]> * wrapup walkthrough * Add delete user method, sync JS to camel case (#1609) * V3 graph testing (#1606) * up * up * up * graph crud * up * community endpts * up * up * up * up * up * up * up * up * add back routers * up * pre-commit * Fix Broken V2 Graphs, Better Response Models (#1612) * Increase test coverage * Fix v2 graphs, better response models * Remaining types * Add types to Python SDK * Typo * update tests * revert test change * up * Add types to package export (#1613) * Graph refactor (#1611) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * Create branch update (#1617) * Graph refactor (#1616) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * List collections (#1619) * up * up * up * up * Graph refactor (#1620) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * up * up * up * up * Nolan/update graph (#1621) * List collections * Update Graph JS SDK * up * up * cleanup * Graph refactor (#1622) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * up * up * up * up * up * up * cleanup * up * up * up * remove unnecessary functions * up * up * complete document embedding workflow * working get command on graph * checkin progress * up * add entity and relationship deletions * no verif * up * up * up * up (#1636) * up * sync graph * up * up * fix relationship distance calc. * fix issue with faulty collection filter (#1637) * Patch/alternative fix logics 2 (#1638) * fix issue with faulty collection filter * further refinements, like fixing limits * up * fix logic around include metadata and scores * fix double collection assignment * up * fix communities * working clusters * up * add collection extraction * add collection extraction * up * prep for merge * Patch/alternative up with nolan (#1643) * SDK First pass * Add feature tracking * Typo * Check in * Rebase * Add Graph tests * Fix Agent empty message bug * Check in JS routes * More tests, examples * Sync python * Expose Entity/Relationship Params in Routes (#1640) * Expose Entity/Relationship Params * Descriptions * Modify create entities * Create relationships * set parent_id * Update entitiy * Update Relationships * Check in * Ellipsis fixes * More cleanup * Start CRUD on communities * Communities DB * Explicit working path * Once again * Fail fast false * Testing around community creation * Delete community test * Update community tests * Clean up type errors, cleaner code * More cleanup * More * remove chunk_entity * Delete bad, unused methods * More * fixup crud * rm pull --------- Co-authored-by: NolanTrem <[email protected]> * Feature/fix graph permissions (#1645) * update docs / collections * up * Feature/fix auth checks (#1647) * update docs / collections * up * fix super user and more * up * up (#1648) * Feature/rm v2 api (#1649) * SDK First pass * Add feature tracking * Typo * Check in * Rebase * Add Graph tests * Fix Agent empty message bug * Check in JS routes * More tests, examples * Sync python * Expose Entity/Relationship Params in Routes (#1640) * Expose Entity/Relationship Params * Descriptions * Modify create entities * Create relationships * set parent_id * Update entitiy * Update Relationships * Check in * Ellipsis fixes * More cleanup * Start CRUD on communities * Communities DB * Explicit working path * Once again * Fail fast false * Testing around community creation * Delete community test * Update community tests * Clean up type errors, cleaner code * More cleanup * More * remove chunk_entity * Delete bad, unused methods * More * remove v2 api * rm kg router * cleanups * fixup delete by filter * fixup delete by filter * fixes * up * up --------- Co-authored-by: NolanTrem <[email protected]> * Improved Data Structures (#1650) * Check in * Most tests fixed * fix tables * Once more * Move to a single community table * Don't modify existing migration script--keep them atomic * Migration * Migration, more clean up * All but deletion working * Up * Feature/tweaks for prod (#1651) * tweaks for prod * up * final tweaks * Nolan/deletion (#1652) * Check in * Most tests fixed * fix tables * Once more * Move to a single community table * Don't modify existing migration script--keep them atomic * Migration * Migration, more clean up * All but deletion working * Up * Fix deletion * Working migration (#1654) * Feature/production tweaks (#1653) * tweaks for prod * up * final tweaks * prod tweaks * fixed --------- Co-authored-by: NolanTrem <[email protected]> * sort --------- Co-authored-by: Nolan Tremelling <[email protected]> Co-authored-by: Shreyas Pimpalgaonkar <[email protected]>
Important
Refactor and enhance knowledge graph functionality by renaming triples to relationships, updating models, handlers, and API endpoints, and adjusting tests accordingly.
getTriples
togetRelationships
inr2rV2ClientIntegrationSuperUser.test.ts
andr2rV2ClientIntegrationUser.test.ts
.KGCreationSettings
inmodels.tsx
to usekg_relationships_extraction_prompt
andmax_knowledge_relationships
.get_triples
toget_relationships
inkg.py
.kg_router.py
andgraph_router.py
to handle relationships instead of triples.Triple
withRelationship
in__init__.py
and other related files.CommunityReport
toCommunity
inabstractions/__init__.py
.PostgresRelationshipHandler
andPostgresEntityHandler
inkg.py
.KGTriplesExtractionPipe
toKGRelationshipsExtractionPipe
inpipes/__init__.py
.community_summary.py
anddeduplication.py
to handle relationships.graphrag_relationships_extraction_few_shot.yaml
to reflect relationship extraction changes.test_kg_community_summary_pipe.py
andtest_kg_logic.py
to align with new relationship handling.This description was created by for 1cb66cf. It will automatically update as commits are pushed.