forked from microsoft/graphrag
-
Notifications
You must be signed in to change notification settings - Fork 27
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'microsoft:main' into main
- Loading branch information
Showing
24 changed files
with
186 additions
and
35 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"type": "patch", | ||
"description": "add visualization guide to doc site" | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"type": "patch", | ||
"description": "fix streaming output error" | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"type": "patch", | ||
"description": "Allow some cicd jobs to skip PRs dedicated to doc updates only." | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"type": "patch", | ||
"description": "Fix a file paths issue in the viz guide." | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"type": "patch", | ||
"description": "Fix optional covariates update in incremental indexing" | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"type": "patch", | ||
"description": "Raise error on empty deltas for inc indexing" | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,35 +1,34 @@ | ||
<div class="grid cards" markdown> | ||
|
||
- [:octicons-arrow-right-24: **GraphRAG: Unlocking LLM discovery on narrative private data**](https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/) | ||
|
||
*** | ||
|
||
<h6>Published February 13, 2024 | ||
<div class="grid cards" markdown> | ||
|
||
By [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager</h6> | ||
- [:octicons-arrow-right-24: __GraphRAG: Unlocking LLM discovery on narrative private data__](https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/) | ||
|
||
- [:octicons-arrow-right-24: **GraphRAG: New tool for complex data discovery now on GitHub**](https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/) | ||
--- | ||
<h6>Published February 13, 2024 | ||
|
||
*** | ||
By [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager</h6> | ||
|
||
|
||
<h6>Published July 2, 2024 | ||
- [:octicons-arrow-right-24: __GraphRAG: New tool for complex data discovery now on GitHub__](https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/) | ||
|
||
By [Darren Edge](https://www.microsoft.com/en-us/research/people/daedge/), Senior Director; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect</h6> | ||
--- | ||
<h6>Published July 2, 2024 | ||
|
||
- [:octicons-arrow-right-24: **GraphRAG auto-tuning provides rapid adaptation to new domains**](https://www.microsoft.com/en-us/research/blog/graphrag-auto-tuning-provides-rapid-adaptation-to-new-domains/) | ||
By [Darren Edge](https://www.microsoft.com/en-us/research/people/daedge/), Senior Director; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect</h6> | ||
|
||
*** | ||
|
||
<h6>Published September 9, 2024 | ||
- [:octicons-arrow-right-24: __GraphRAG auto-tuning provides rapid adaptation to new domains__](https://www.microsoft.com/en-us/research/blog/graphrag-auto-tuning-provides-rapid-adaptation-to-new-domains/) | ||
|
||
By [Alonso Guevara Fernández](https://www.microsoft.com/en-us/research/people/alonsog/), Sr. Software Engineer; Katy Smith, Data Scientist II; [Joshua Bradley](https://www.microsoft.com/en-us/research/people/joshbradley/), Senior Data Scientist; [Darren Edge](https://www.microsoft.com/en-us/research/people/daedge/), Senior Director; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; [Sarah Smith](https://www.microsoft.com/en-us/research/people/smithsarah/), Senior Program Manager; [Ben Cutler](https://www.microsoft.com/en-us/research/people/bcutler/), Senior Director; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect | ||
--- | ||
<h6>Published September 9, 2024 | ||
|
||
- [:octicons-arrow-right-24: **Introducing DRIFT Search: Combining global and local search methods to improve quality and efficiency**](https://www.microsoft.com/en-us/research/blog/introducing-drift-search-combining-global-and-local-search-methods-to-improve-quality-and-efficiency/) | ||
By [Alonso Guevara Fernández](https://www.microsoft.com/en-us/research/people/alonsog/), Sr. Software Engineer; Katy Smith, Data Scientist II; [Joshua Bradley](https://www.microsoft.com/en-us/research/people/joshbradley/), Senior Data Scientist; [Darren Edge](https://www.microsoft.com/en-us/research/people/daedge/), Senior Director; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; [Sarah Smith](https://www.microsoft.com/en-us/research/people/smithsarah/), Senior Program Manager; [Ben Cutler](https://www.microsoft.com/en-us/research/people/bcutler/), Senior Director; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect</h6> | ||
|
||
*** | ||
- [:octicons-arrow-right-24: __Introducing DRIFT Search: Combining global and local search methods to improve quality and efficiency__](https://www.microsoft.com/en-us/research/blog/introducing-drift-search-combining-global-and-local-search-methods-to-improve-quality-and-efficiency/) | ||
|
||
<h6>Published October 31, 2024 | ||
--- | ||
<h6>Published October 31, 2024 | ||
|
||
By Julian Whiting , Senior Machine Learning Engineer; Zachary Hills , Senior Software Engineer; [Alonso Guevara Fernández](https://www.microsoft.com/en-us/research/people/alonsog/), Sr. Software Engineer; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; Adam Bradley , Managing Partner, Strategic Research; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect | ||
By Julian Whiting, Senior Machine Learning Engineer; Zachary Hills , Senior Software Engineer; [Alonso Guevara Fernández](https://www.microsoft.com/en-us/research/people/alonsog/), Sr. Software Engineer; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; Adam Bradley , Managing Partner, Strategic Research; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect</h6> | ||
|
||
</div> | ||
</div> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
# Visualizing and Debugging Your Knowledge Graph | ||
|
||
The following step-by-step guide walks through the process to visualize a knowledge graph after it's been constructed by graphrag. Note that some of the settings recommended below are based on our own experience of what works well. Feel free to change and explore other settings for a better visualization experience! | ||
|
||
## 1. Run the Pipeline | ||
Before building an index, please review your `settings.yaml` configuration file and ensure that graphml snapshots is enabled. | ||
```yaml | ||
snapshots: | ||
graphml: true | ||
``` | ||
(Optional) To support other visualization tools and exploration, additional parameters can be enabled that provide access to vector embeddings. | ||
```yaml | ||
embed_graph: | ||
enabled: true # will generate node2vec embeddings for nodes | ||
umap: | ||
enabled: true # will generate UMAP embeddings for nodes | ||
``` | ||
After running the indexing pipeline over your data, there will be an output folder (defined by the `storage.base_dir` setting). | ||
|
||
- **Output Folder**: Contains artifacts from the LLM’s indexing pass. | ||
|
||
## 2. Locate the Knowledge Graph | ||
In the output folder, look for a file named `merged_graph.graphml`. graphml is a standard [file format](http://graphml.graphdrawing.org) supported by many visualization tools. We recommend trying [Gephi](https://gephi.org). | ||
|
||
## 3. Open the Graph in Gephi | ||
1. Install and open Gephi | ||
2. Navigate to the `output` folder containing the various parquet files. | ||
3. Import the `merged_graph.graphml` file into Gephi. This will result in a fairly plain view of the undirected graph nodes and edges. | ||
|
||
<p align="center"> | ||
<img src="../img/viz_guide/gephi-initial-graph-example.png" alt="A basic graph visualization by Gephi" width="300"/> | ||
</p> | ||
|
||
## 4. Install the Leiden Algorithm Plugin | ||
1. Go to `Tools` -> `Plugins`. | ||
2. Search for "Leiden Algorithm". | ||
3. Click `Install` and restart Gephi. | ||
|
||
## 5. Run Statistics | ||
1. In the `Statistics` tab on the right, click `Run` for `Average Degree` and `Leiden Algorithm`. | ||
|
||
<p align="center"> | ||
<img src="../img/viz_guide/gephi-network-overview-settings.png" alt="A view of Gephi's network overview settings" width="300"/> | ||
</p> | ||
|
||
2. For the Leiden Algorithm, adjust the settings: | ||
- **Quality function**: Modularity | ||
- **Resolution**: 1 | ||
|
||
## 6. Color the Graph by Clusters | ||
1. Go to the `Appearance` pane in the upper left side of Gephi. | ||
|
||
<p align="center"> | ||
<img src="../img/viz_guide/gephi-appearance-pane.png" alt="A view of Gephi's appearance pane" width="500"/> | ||
</p> | ||
|
||
2. Select `Nodes`, then `Partition`, and click the color palette icon in the upper right. | ||
3. Choose `Cluster` from the dropdown. | ||
4. Click the `Palette...` hyperlink, then `Generate...`. | ||
5. Uncheck `Limit number of colors`, click `Generate`, and then `Ok`. | ||
6. Click `Apply` to color the graph. This will color the graph based on the partitions discovered by Leiden. | ||
|
||
## 7. Resize Nodes by Degree Centrality | ||
1. In the `Appearance` pane in the upper left, select `Nodes` -> `Ranking` | ||
2. Select the `Sizing` icon in the upper right. | ||
2. Choose `Degree` and set: | ||
- **Min**: 10 | ||
- **Max**: 150 | ||
3. Click `Apply`. | ||
|
||
## 8. Layout the Graph | ||
1. In the `Layout` tab in the lower left, select `OpenORD`. | ||
|
||
<p align="center"> | ||
<img src="../img/viz_guide/gephi-layout-pane.png" alt="A view of Gephi's layout pane" width="400"/> | ||
</p> | ||
|
||
2. Set `Liquid` and `Expansion` stages to 50, and everything else to 0. | ||
3. Click `Run` and monitor the progress. | ||
|
||
## 9. Run ForceAtlas2 | ||
1. Select `Force Atlas 2` in the layout options. | ||
|
||
<p align="center"> | ||
<img src="../img/viz_guide/gephi-layout-forceatlas2-pane.png" alt="A view of Gephi's ForceAtlas2 layout pane" width="400"/> | ||
</p> | ||
|
||
2. Adjust the settings: | ||
- **Scaling**: 15 | ||
- **Dissuade Hubs**: checked | ||
- **LinLog mode**: uncheck | ||
- **Prevent Overlap**: checked | ||
3. Click `Run` and wait. | ||
4. Press `Stop` when it looks like the graph nodes have settled and no longer change position significantly. | ||
|
||
## 10. Add Text Labels (Optional) | ||
1. Turn on text labels in the appropriate section. | ||
2. Configure and resize them as needed. | ||
|
||
Your final graph should now be visually organized and ready for analysis! |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters