- I have searched the existing issues and this bug is not already filed.
- My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.
Describe the issue
```
Error running pipeline!
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/graphrag/index/run/run_workflows.py", line 166, in _run_workflows
    result = await run_workflow(
  File "/opt/conda/lib/python3.11/site-packages/graphrag/index/workflows/extract_graph.py", line 45, in run_workflow
    base_entity_nodes, base_relationship_edges = await extract_graph(
  File "/opt/conda/lib/python3.11/site-packages/graphrag/index/flows/extract_graph.py", line 33, in extract_graph
    entities, relationships = await extract_entities(
  File "/opt/conda/lib/python3.11/site-packages/graphrag/index/operations/extract_entities/extract_entities.py", line 137, in extract_entities
    relationships = _merge_relationships(relationship_dfs)
  File "/opt/conda/lib/python3.11/site-packages/graphrag/index/operations/extract_entities/extract_entities.py", line 178, in _merge_relationships
    .agg(
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/groupby/generic.py", line 1432, in aggregate
    result = op.agg()
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/apply.py", line 190, in agg
    return self.agg_dict_like()
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/apply.py", line 423, in agg_dict_like
    return self.agg_or_apply_dict_like(op_name="agg")
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/apply.py", line 1608, in agg_or_apply_dict_like
    result_index, result_data = self.compute_dict_like(
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/apply.py", line 462, in compute_dict_like
    func = self.normalize_dictlike_arg(op_name, selected_obj, func)
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/apply.py", line 663, in normalize_dictlike_arg
    raise KeyError(f"Column(s) {list(cols)} do not exist")
KeyError: "Column(s) ['description', 'source_id', 'weight'] do not exist"
```
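For context, this `KeyError` is the generic pandas failure raised when a dict-style `groupby(...).agg(...)` names columns that are not present in the frame. It can occur here when the extracted relationship DataFrames carry only the grouping keys (e.g. because the model returned no usable relationship fields), so the merged frame lacks `description`, `source_id`, and `weight`. A minimal sketch of that pandas behavior (the column names are taken from the trace above; this is an illustration, not GraphRAG's actual code):

```python
import pandas as pd

# Simulate a merged relationship frame that has only the grouping keys
# and none of the columns the aggregation expects.
merged = pd.DataFrame({"source": [], "target": []})

try:
    # Same aggregation shape as the failing _merge_relationships call:
    # pandas validates the dict keys against the frame's columns first,
    # so missing columns raise KeyError even on an empty frame.
    merged.groupby(["source", "target"]).agg(
        {"description": list, "source_id": list, "weight": "sum"}
    )
    error = None
except KeyError as exc:
    error = str(exc)

print(error)
```

This reproduces the same `Column(s) [...] do not exist` message, which suggests the bug is upstream of the aggregation: the extraction step produced relationship frames without those columns rather than the aggregation itself being wrong.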
Steps to reproduce
No response
GraphRAG Config Used
```yaml
### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

encoding_model: cl100k_base # this needs to be matched to your model!

llm:
  api_key: lm-studio # set this in the generated .env file
  type: openai_chat # or azure_openai_chat
  #model: deepseek-r1:32b
  #max_tokens: 4000
  model_supports_json: false # recommended if this is available for your model.
  #model: deepseek-r1
  #api_base: http://192.168.2.131:11434/v1/
  # audience: "https://cognitiveservices.azure.com/.default"
  #model: deepseek-r1-distill-qwen-7b
  #api_base: http://192.168.2.131:1234/v1/
  model: Qwen/Qwen2.5-1.5B-Instruct
  api_base: http://192.168.2.131:30000/v1/
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>

parallelization:
  stagger: 0.3
  num_threads: 50

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  vector_store:
    type: lancedb # one of [lancedb, azure_ai_search, cosmosdb]
    db_uri: 'output/lancedb'
    collection_name: default
    overwrite: true
  llm:
    api_key: lm-studio
    type: openai_embedding # or azure_openai_embedding
    #model: quentinz/bge-large-zh-v1.5
    #api_base: http://192.168.2.131:11434/api
    model: text-embedding-bge-m3
    api_base: http://192.168.2.131:1234/v1
    #model: BAAI/bge-m3
    #api_base: http://192.168.2.131:30000/v1/
    max_tokens: 1024
    # api_version: 2024-02-15-preview
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>

### Input settings ###

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

chunks:
  size: 4096
  overlap: 100
  group_by_columns: [id]

### Storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

cache:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "cache"

reporting:
  type: file # or console, blob
  base_dir: "logs"

storage:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "output"

## only turn this on if running `graphrag index` with custom settings
## we normally use `graphrag update` with the defaults
update_index_storage:
  # type: file # or blob
  # base_dir: "update_output"

### Workflow settings ###

skip_workflows: []

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization, person, geo, event]
  max_gleanings: 1

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  enabled: false
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 4000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: true # if true, will generate node2vec embeddings for nodes

umap:
  enabled: true # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)

snapshots:
  graphml: true
  embeddings: true
  transient: true

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  prompt: "prompts/drift_search_system_prompt.txt"
  reduce_prompt: "prompts/drift_search_reduce_prompt.txt"

basic_search:
  prompt: "prompts/basic_search_system_prompt.txt"
```
Logs and screenshots
Additional Information
GraphRAG Version: 1.2.0
Operating System: Ubuntu
Python Version: 3.11
Related Issues:
lingfan added the `triage` label on Feb 13, 2025.