INTPYTHON-330 GraphRAG #66
…hain_mongodb/graphrag
…s of all entities in one call to
…icates in traversal
…ted tests and prompts.
…ovide template for user-provided examples
…it for better UX. Added allowed_entity_types to name extraction from query.
I made a first pass, reviewing everything but the implementation details of the MongoDBGraphStore. I mostly have cosmetic feedback. Excellent work!
from importlib.metadata import version
from typing import Any, Dict, List, Optional, Union

try:
We need to put this behind an if TYPE_CHECKING to avoid a runtime dependency on typing_extensions.
Ok no problem. What's the best practice? How does one tell what must be placed behind TYPE_CHECKING and what doesn't?
Anything that is only needed for typing could be put under TYPE_CHECKING. Anything that would result in a new runtime dep must be put under TYPE_CHECKING.
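A minimal sketch of that rule, not the code from this PR: the typing_extensions import sits under TYPE_CHECKING, so it is never executed at runtime. The else branch and the describe helper are assumptions added for illustration; the fallback is only needed if the alias is also used in annotations that are evaluated at runtime.

```python
from typing import TYPE_CHECKING, Any, Dict, Optional

if TYPE_CHECKING:
    # Only static type checkers evaluate this branch, so typing_extensions
    # never has to be installed where the code actually runs.
    from typing_extensions import TypeAlias

    Entity: TypeAlias = Dict[str, Any]
else:
    # Runtime fallback (an assumption of this sketch): a plain assignment
    # keeps the name usable without importing TypeAlias.
    Entity = Dict[str, Any]


def describe(entity: Optional[Entity]) -> str:
    """Hypothetical helper that uses the alias in a signature."""
    return "missing" if entity is None else str(entity.get("_id"))
```

Without the else branch, any annotation that references Entity would raise NameError when it is evaluated at runtime, unless annotations are deferred or quoted.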
Is this what's required?
if TYPE_CHECKING:
    Entity: TypeAlias = Dict[str, Any]
    """Represents an Entity in the knowledge graph with specific schema. See .schema"""
GraphRAG is a ChatModel that provides responses to semantic queries
based on a Knowledge Graph that an LLM is used to create.
As in Vector RAG, we augment the Chat Model's training data
with relevant information that we collect from documents.
with relevant information that we collect from documents. | |
with relevant information that we collect from documents. |
In Graph RAG, one uses an "Entity-Extraction" model that converts
text into Entities and their relationships, a Knowledge Graph.
Comparison is done by Graph traversal, finding entities connected
to the query prompts. These are then supplied to the Chat Model as context.
to the query prompts. These are then supplied to the Chat Model as context. | |
to the query prompts. These are then supplied to the Chat Model as context. |
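The traversal step described in that docstring can be illustrated with a small in-memory sketch. This is not the MongoDBGraphStore implementation, which the thread does not show; the GRAPH data, the function name, and the way hops are counted are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

# Hypothetical knowledge graph: entity name -> names of related entities.
GRAPH: Dict[str, List[str]] = {
    "MongoDB": ["Atlas", "New York"],
    "Atlas": ["Vector Search"],
    "Vector Search": ["RAG"],
    "New York": [],
    "RAG": [],
}


def related_entities(
    graph: Dict[str, List[str]], starting_entities: Iterable[str], max_depth: int
) -> Set[str]:
    """Breadth-first traversal that follows at most max_depth relationship hops."""
    # Start only from names that actually exist in the graph.
    seen: Set[str] = {name for name in starting_entities if name in graph}
    queue = deque((name, 0) for name in seen)
    while queue:
        name, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached; do not expand this entity
        for neighbor in graph.get(name, []):
            if neighbor not in seen:  # skip entities already visited
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen
```

In a Graph RAG flow, the starting entities would be the names extracted from the query, and the entities returned here would be passed to the chat model as context.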
Args:
documents: list of textual documents and associated metadata.
Returns:
List containing metadata on entities inserted and updated, one value for each input document
List containing metadata on entities inserted and updated, one value for each input document
List containing metadata on entities inserted and updated, one value for each input document.
def find_entity_by_name(self, name: str) -> Optional[Entity]:
    """Utility to get Entity dict from Knowledge Graph / Collection.
    Args:
        name: _id string to look for
name: _id string to look for
name: _id string to look for.
Args:
name: _id string to look for
Returns:
List of Entity dicts if any match name
List of Entity dicts if any match name
List of Entity dicts if any match name.
Args:
starting_entities: Traversal begins with documents whose _id fields match these strings.
max_depth: Recursion continues until no more matching documents are found,
or until the operation reaches a recursion depth specified by the maxDepth parameter
Does this need to be indented to render properly?
I mean the line that continues after max_depth, I think it needs to be indented further
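For reference, a sketch of the layout being asked for, assuming Google-style docstrings: a description that wraps onto a second line is indented one level deeper than the argument name. The function below is a hypothetical stub with reworded text, not the method under review.

```python
from typing import Any, Dict, List


def related_entities_stub(
    starting_entities: List[str], max_depth: int = 2
) -> List[Dict[str, Any]]:
    """Hypothetical stub; only the docstring layout matters here.

    Args:
        starting_entities: Traversal begins with documents whose _id fields
            match these strings.
        max_depth: Recursion continues until no more matching documents
            are found, or until this recursion depth is reached.

    Returns:
        An empty list, since this stub performs no traversal.
    """
    return []
```

The hanging indent lets a docstring parser tell that the wrapped text still belongs to max_depth instead of starting a new entry.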
Args:
query: Prompt before it is augmented by Knowledge Graph.
chat_model: ChatBot. Defaults to entity_extraction_model.
prompt: Alternative Prompt Template. Defaults to prompts.rag_prompt
prompt: Alternative Prompt Template. Defaults to prompts.rag_prompt
prompt: Alternative Prompt Template. Defaults to prompts.rag_prompt.
Yes. The indentation is necessary.
Co-authored-by: Steven Silvester <[email protected]>

from langchain_mongodb.graphrag import example_templates, prompts

from .prompts import rag_prompt
Can we please make all of these constants UPPERCASE? That is the convention used by the sql_toolkit and I find it easier to tell what is a constant and what is a user input.
I changed the string names to UPPERCASE. By the time they are imported here, they aren't constants, though; they've been wrapped. It's a good idea. It draws attention to them.
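A rough sketch of the naming convention under discussion, using the standard library's string.Template as a stand-in for a LangChain prompt template. The names RAG_PROMPT_TEXT, RAG_PROMPT, and build_prompt, and the prompt wording, are hypothetical and not taken from the PR.

```python
from string import Template

# Module-level values that callers are not expected to reassign are
# written in UPPERCASE, which sets them apart from user input.
RAG_PROMPT_TEXT = "Answer using this context:\n$context\n\nQuestion: $query"

# The wrapped object is no longer a plain string, but keeping the
# UPPERCASE name still signals "module default, not user input".
RAG_PROMPT = Template(RAG_PROMPT_TEXT)


def build_prompt(query: str, context: str, prompt: Template = RAG_PROMPT) -> str:
    """Fill the default (or a user-supplied) template with query and context."""
    return prompt.substitute(query=query, context=context)
```

In a signature such as build_prompt, the lowercase prompt parameter is the user-supplied value and the UPPERCASE name is the library default, which is the distinction the reviewer wanted to be visible.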
Args:
starting_entities: Traversal begins with documents whose _id fields match these strings.
max_depth: Recursion continues until no more matching documents are found,
or until the operation reaches a recursion depth specified by the maxDepth parameter
I mean the line that continues after max_depth, I think it needs to be indented further
You're right.
LGTM!
Adds MongoDBGraphStore
"""
GraphRAG is a ChatModel that provides responses to semantic queries.
As in Vector RAG, we augment the Chat Model's training data
with relevant information that we collect from documents.
"""