forked from neo4j/neo4j-graphrag-python
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Restructure examples folder (neo4j#146)
* Structure proposal
* Backup old examples in a specific folder (tmp)
* WIP: example folder structure refactoring
* ruff
* Add result formatter example
* LLM examples
* MistralAILLM example + doc
* Simple KG builder example
* Embeder examples
* Weaviate example
* Fix import for cohere embeddings
* Format
* Update README with links to new files
* Move Pinecone examples
* Can't remove this file yet - but remove link to this specific file from doc - need to keep the file until the next release but then remove
* Pinecone + cleaning
* Cleaning 'old' folder
* Components examples
* Test and harmonize retriever section
* Deal with qdrant examples - add custom component
* Nicer path definition
* Mypy/ruff
* Rename answer -> QA + add links
* Use pre_filters variable for explicitness
* ruff
* ruff
* Missing files for db operations
* Fix openai example
* Fix CI
* :'(
Showing 87 changed files with 3,299 additions and 888 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,132 @@ | ||
# Examples Index | ||
|
||
This folder contains usage examples for the different features | ||
supported by the `neo4j-graphrag` package: | ||
|
||
- [Build Knowledge Graph](#build-knowledge-graph) from PDF or text | ||
- [Retrieve](#retrieve) information from the graph | ||
- [Question Answering](#answer-graphrag) (Q&A) | ||
|
||
Each of these steps has many customization options which | ||
are listed in [the last section of this file](#customize). | ||
|
||
## Build Knowledge Graph | ||
|
||
- [End to end PDF to graph simple pipeline](build_graph/simple_kg_builder_from_pdf.py) | ||
- [End to end text to graph simple pipeline](build_graph/simple_kg_builder_from_text.py) | ||
|
||
|
||
## Retrieve | ||
|
||
- [Retriever from an embedding vector](retrieve/similarity_search_for_vector.py) | ||
- [Retriever from a text](retrieve/similarity_search_for_text.py) | ||
- [Graph-based retrieval with VectorCypherRetriever](retrieve/vector_cypher_retriever.py) | ||
- [Hybrid retriever](./retrieve/hybrid_retriever.py) | ||
- [Hybrid Cypher retriever](./retrieve/hybrid_cypher_retriever.py) | ||
- [Text2Cypher retriever](./retrieve/text2cypher_search.py) | ||
|
||
|
||
### External Retrievers | ||
|
||
#### Weaviate | ||
|
||
- [Vector search](customize/retrievers/external/weaviate/weaviate_vector_search.py) | ||
- [Text search with local embedder](customize/retrievers/external/weaviate/weaviate_text_search_local_embedder.py) | ||
- [Text search with remote embedder](customize/retrievers/external/weaviate/weaviate_text_search_remote_embedder.py) | ||
|
||
#### Pinecone | ||
|
||
- [Vector search](./customize/retrievers/external/pinecone/pinecone_vector_search.py) | ||
- [Text search](./customize/retrievers/external/pinecone/pinecone_text_search.py) | ||
|
||
|
||
#### Qdrant | ||
|
||
- [Vector search](./customize/retrievers/external/qdrant/qdrant_vector_search.py) | ||
- [Text search](./customize/retrievers/external/qdrant/qdrant_text_search.py) | ||
|
||
|
||
## Answer: GraphRAG | ||
|
||
- [End to end GraphRAG](./answer/graphrag.py) | ||
|
||
|
||
## Customize | ||
|
||
### Retriever | ||
|
||
- [Control result format for VectorRetriever](customize/retrievers/result_formatter_vector_retriever.py) | ||
- [Control result format for VectorCypherRetriever](customize/retrievers/result_formatter_vector_cypher_retriever.py) | ||
|
||
|
||
### LLMs | ||
|
||
- [OpenAI (GPT)](./customize/llms/openai_llm.py) | ||
- [Azure OpenAI]() | ||
- [VertexAI (Gemini)](./customize/llms/vertexai_llm.py) | ||
- [MistralAI](./customize/llms/mistalai_llm.py) | ||
- [Cohere](./customize/llms/cohere_llm.py) | ||
- [Anthropic (Claude)](./customize/llms/anthropic_llm.py) | ||
- [Ollama]() | ||
- [Custom LLM](./customize/llms/custom_llm.py) | ||
|
||
|
||
### Prompts | ||
|
||
- [Using a custom prompt](old/graphrag_custom_prompt.py) | ||
|
||
|
||
### Embedders | ||
|
||
- [OpenAI](./customize/embeddings/openai_embeddings.py) | ||
- [Azure OpenAI](./customize/embeddings/azure_openai_embeddings.py) | ||
- [VertexAI](./customize/embeddings/vertexai_embeddings.py) | ||
- [MistralAI](./customize/embeddings/mistalai_embeddings.py) | ||
- [Cohere](./customize/embeddings/cohere_embeddings.py) | ||
- [Ollama](./customize/embeddings/ollama_embeddings.py) | ||
- [Custom embedder](./customize/embeddings/custom_embeddings.py) | ||
|
||
|
||
### KG Construction - Pipeline | ||
|
||
- [End to end example with explicit components and text input](./customize/build_graph/pipeline/kg_builder_from_text.py) | ||
- [End to end example with explicit components and PDF input](./customize/build_graph/pipeline/kg_builder_from_pdf.py) | ||
|
||
#### Components | ||
|
||
- Loaders: | ||
    - [Load PDF file](./customize/build_graph/components/loaders/pdf_loader.py) | ||
    - [Custom](./customize/build_graph/components/loaders/custom_loader.py) | ||
- Text Splitter: | ||
    - [Fixed size splitter](./customize/build_graph/components/splitters/fixed_size_splitter.py) | ||
    - [Splitter from LangChain](./customize/build_graph/components/splitters/langhchain_splitter.py) | ||
    - [Splitter from LlamaIndex](./customize/build_graph/components/splitters/llamaindex_splitter.py) | ||
    - [Custom](./customize/build_graph/components/splitters/custom_splitter.py) | ||
- [Chunk embedder]() | ||
- Schema Builder: | ||
    - [User-defined](./customize/build_graph/components/schema_builders/schema.py) | ||
- Entity Relation Extractor: | ||
    - [LLM-based](./customize/build_graph/components/extractors/llm_entity_relation_extractor.py) | ||
    - [LLM-based with custom prompt](./customize/build_graph/components/extractors/llm_entity_relation_extractor_with_custom_prompt.py) | ||
    - [Custom](./customize/build_graph/components/extractors/custom_extractor.py) | ||
- Knowledge Graph Writer: | ||
    - [Neo4j writer](./customize/build_graph/components/writers/neo4j_writer.py) | ||
    - [Custom](./customize/build_graph/components/writers/custom_writer.py) | ||
- Entity Resolver: | ||
    - [SinglePropertyExactMatchResolver](./customize/build_graph/components/resolvers/simple_entity_resolver.py) | ||
    - [SinglePropertyExactMatchResolver with pre-filter](./customize/build_graph/components/resolvers/simple_entity_resolver_pre_filter.py) | ||
    - [Custom resolver](./customize/build_graph/components/resolvers/custom_resolver.py) | ||
- [Custom component](./customize/build_graph/components/custom_component.py) | ||
|
||
|
||
### Answer: GraphRAG | ||
|
||
- [LangChain compatibility](./customize/answer/langchain_compatiblity.py) | ||
- [Use a custom prompt](./customize/answer/custom_prompt.py) | ||
|
||
|
||
## Database Operations | ||
|
||
- [Create vector index](database_operations/create_vector_index.py) | ||
- [Create full text index](database_operations/create_fulltext_index.py) | ||
- [Populate vector index](database_operations/populate_vector_index.py) |
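
Review note: the index above mixes link styles and still contains entries with an empty target (`[Azure OpenAI]()`, `[Ollama]()`, `[Chunk embedder]()`). A small check along the following lines could flag such entries before a release. This is only a sketch using the standard library; `find_links` and `broken_links` are hypothetical helper names, not part of `neo4j-graphrag`, and the regular expression only handles simple inline links.

```python
import re
from pathlib import Path

# Matches inline markdown links such as [label](target); the target may be empty.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]*)\)")


def find_links(markdown: str) -> list[tuple[str, str]]:
    """Return (label, target) pairs for every inline link in the text."""
    return LINK_RE.findall(markdown)


def broken_links(markdown: str, base_dir: Path) -> list[tuple[str, str]]:
    """Return links whose target is empty or points to a file missing on disk.

    In-page anchors (#section) and absolute URLs are skipped.
    """
    problems = []
    for label, target in find_links(markdown):
        if target.startswith("#") or "://" in target:
            continue
        if not target or not (base_dir / target).is_file():
            problems.append((label, target))
    return problems
```

Assuming the README sits at the root of the examples folder, it could be run as `broken_links(readme.read_text(), readme.parent)`.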
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
"""This example illustrates how to get started easily with the SimpleKGPipeline | ||
and ingest PDF into a Neo4j Knowledge Graph. | ||
This example assumes a Neo4j db is up and running. Update the credentials below | ||
if needed. | ||
OPENAI_API_KEY needs to be in the env vars. | ||
""" | ||
|
||
import asyncio | ||
from pathlib import Path | ||
|
||
import neo4j | ||
from neo4j_graphrag.embeddings import OpenAIEmbeddings | ||
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline | ||
from neo4j_graphrag.experimental.pipeline.pipeline import PipelineResult | ||
from neo4j_graphrag.llm import LLMInterface | ||
from neo4j_graphrag.llm.openai_llm import OpenAILLM | ||
|
||
# Neo4j db infos | ||
URI = "neo4j://localhost:7687" | ||
AUTH = ("neo4j", "password") | ||
DATABASE = "neo4j" | ||
|
||
|
||
root_dir = Path(__file__).parents[4] | ||
file_path = root_dir / "data" / "Harry Potter and the Chamber of Secrets Summary.pdf" | ||
|
||
|
||
# Instantiate Entity and Relation objects. This defines the | ||
# entities and relations the LLM will be looking for in the text. | ||
ENTITIES = ["Person", "Organization", "Location"] | ||
RELATIONS = ["SITUATED_AT", "INTERACTS", "LED_BY"] | ||
POTENTIAL_SCHEMA = [ | ||
    ("Person", "SITUATED_AT", "Location"), | ||
    ("Person", "INTERACTS", "Person"), | ||
    ("Organization", "LED_BY", "Person"), | ||
] | ||
|
||
|
||
async def define_and_run_pipeline( | ||
    neo4j_driver: neo4j.Driver, | ||
    llm: LLMInterface, | ||
) -> PipelineResult: | ||
    # Create an instance of the SimpleKGPipeline | ||
    kg_builder = SimpleKGPipeline( | ||
        llm=llm, | ||
        driver=neo4j_driver, | ||
        embedder=OpenAIEmbeddings(), | ||
        entities=ENTITIES, | ||
        relations=RELATIONS, | ||
        potential_schema=POTENTIAL_SCHEMA, | ||
    ) | ||
    return await kg_builder.run_async(file_path=str(file_path)) | ||
|
||
|
||
async def main() -> PipelineResult: | ||
    llm = OpenAILLM( | ||
        model_name="gpt-4o", | ||
        model_params={ | ||
            "max_tokens": 2000, | ||
            "response_format": {"type": "json_object"}, | ||
        }, | ||
    ) | ||
    with neo4j.GraphDatabase.driver(URI, auth=AUTH, database=DATABASE) as driver: | ||
        res = await define_and_run_pipeline(driver, llm) | ||
    await llm.async_client.close() | ||
    return res | ||
|
||
|
||
if __name__ == "__main__": | ||
    res = asyncio.run(main()) | ||
    print(res) |
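
Review note: `root_dir = Path(__file__).parents[4]` depends on how deep the script sits in the repository, so it is worth re-checking whenever files move. The sketch below illustrates how `parents[n]` walks up from a file path. The layout and file name are hypothetical and chosen only for illustration, and `data_file_for` is an illustrative helper, not part of the example.

```python
from pathlib import PurePosixPath


def data_file_for(script: PurePosixPath, index: int, name: str) -> PurePosixPath:
    """Build <ancestor>/data/<name>, where the ancestor is script.parents[index].

    parents[0] is the directory containing the script, parents[1] is that
    directory's parent, and so on up to the filesystem root.
    """
    return script.parents[index] / "data" / name


# Hypothetical layout, for illustration only.
script = PurePosixPath("/repo/examples/build_graph/simple_kg_builder_from_pdf.py")

print(script.parents[0])  # directory containing the script
print(script.parents[1])  # one level above it
print(data_file_for(script, 1, "summary.pdf"))
```

An index beyond the last ancestor raises `IndexError`, so a value that is too large fails immediately, while a value that is merely wrong resolves to an unintended directory without any error.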
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
"""This example illustrates how to get started easily with the SimpleKGPipeline | ||
and ingest text into a Neo4j Knowledge Graph. | ||
This example assumes a Neo4j db is up and running. Update the credentials below | ||
if needed. | ||
""" | ||
|
||
import asyncio | ||
|
||
import neo4j | ||
from neo4j_graphrag.embeddings import OpenAIEmbeddings | ||
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline | ||
from neo4j_graphrag.experimental.pipeline.pipeline import PipelineResult | ||
from neo4j_graphrag.llm import LLMInterface | ||
from neo4j_graphrag.llm.openai_llm import OpenAILLM | ||
|
||
# Neo4j db infos | ||
URI = "neo4j://localhost:7687" | ||
AUTH = ("neo4j", "password") | ||
DATABASE = "neo4j" | ||
|
||
# Text to process | ||
TEXT = """The son of Duke Leto Atreides and the Lady Jessica, Paul is the heir of House Atreides, | ||
an aristocratic family that rules the planet Caladan.""" | ||
|
||
# Instantiate Entity and Relation objects. This defines the | ||
# entities and relations the LLM will be looking for in the text. | ||
ENTITIES = ["Person", "House", "Planet"] | ||
RELATIONS = ["PARENT_OF", "HEIR_OF", "RULES"] | ||
POTENTIAL_SCHEMA = [ | ||
    ("Person", "PARENT_OF", "Person"), | ||
    ("Person", "HEIR_OF", "House"), | ||
    ("House", "RULES", "Planet"), | ||
] | ||
|
||
|
||
async def define_and_run_pipeline( | ||
    neo4j_driver: neo4j.Driver, | ||
    llm: LLMInterface, | ||
) -> PipelineResult: | ||
    # Create an instance of the SimpleKGPipeline | ||
    kg_builder = SimpleKGPipeline( | ||
        llm=llm, | ||
        driver=neo4j_driver, | ||
        embedder=OpenAIEmbeddings(), | ||
        entities=ENTITIES, | ||
        relations=RELATIONS, | ||
        potential_schema=POTENTIAL_SCHEMA, | ||
        from_pdf=False, | ||
    ) | ||
    return await kg_builder.run_async(text=TEXT) | ||
|
||
|
||
async def main() -> PipelineResult: | ||
    llm = OpenAILLM( | ||
        model_name="gpt-4o", | ||
        model_params={ | ||
            "max_tokens": 2000, | ||
            "response_format": {"type": "json_object"}, | ||
        }, | ||
    ) | ||
    with neo4j.GraphDatabase.driver(URI, auth=AUTH, database=DATABASE) as driver: | ||
        res = await define_and_run_pipeline(driver, llm) | ||
    await llm.async_client.close() | ||
    return res | ||
|
||
|
||
if __name__ == "__main__": | ||
    res = asyncio.run(main()) | ||
    print(res) |
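
Review note: `POTENTIAL_SCHEMA` is only meaningful if every triple uses labels declared in `ENTITIES` and a type declared in `RELATIONS`. A quick pre-flight check can catch a typo before any LLM call is made. The following is a minimal sketch; `check_schema` is a hypothetical helper, not part of `neo4j-graphrag`, and the library may perform its own validation.

```python
def check_schema(
    entities: list[str],
    relations: list[str],
    potential_schema: list[tuple[str, str, str]],
) -> list[str]:
    """Return one message per undeclared name used in a schema triple."""
    known_entities = set(entities)
    known_relations = set(relations)
    problems = []
    for triple in potential_schema:
        source, relation, target = triple
        if source not in known_entities:
            problems.append(f"{triple}: unknown source entity {source!r}")
        if relation not in known_relations:
            problems.append(f"{triple}: unknown relation {relation!r}")
        if target not in known_entities:
            problems.append(f"{triple}: unknown target entity {target!r}")
    return problems


# Same values as in the example above.
ENTITIES = ["Person", "House", "Planet"]
RELATIONS = ["PARENT_OF", "HEIR_OF", "RULES"]
POTENTIAL_SCHEMA = [
    ("Person", "PARENT_OF", "Person"),
    ("Person", "HEIR_OF", "House"),
    ("House", "RULES", "Planet"),
]

# An empty list means every triple only uses declared names.
print(check_schema(ENTITIES, RELATIONS, POTENTIAL_SCHEMA))
```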