This is the code for the paper 'Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation' (accepted by AAAI 2025).
conda create -n amar python=3.8
pip install -r requirement.txt
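To sanity-check the environment, you can verify that PyTorch sees your GPU. This is a minimal, optional check; it assumes the requirements install PyTorch, which the fine-tuning scripts below rely on:

# check_env.py -- optional environment sanity check (file name is illustrative)
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())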
Both datasets use Freebase as the knowledge source. You may refer to Freebase Virtuoso Setup to set up a Virtuoso triplestore service. We briefly list some key steps below:
Download OpenLink Virtuoso from https://github.com/openlink/virtuoso-opensource/releases, and put it in Amar/
Environment setup:
sudo apt install unixodbc unixodbc-dev
Download the database:
cd Freebase-Setup
wget https://www.dropbox.com/s/q38g0fwx1a3lz8q/virtuoso_db.zip
unzip virtuoso_db.zip
To start the Virtuoso service:
python3 virtuoso.py start 3001 -d virtuoso_db
To stop a running service on the same port:
python3 virtuoso.py stop 3001
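Once the service is up, you can issue a test query against the SPARQL endpoint. This is a minimal sketch using the SPARQLWrapper package (install it with pip if needed); port 3001 matches the start command above, and /sparql is Virtuoso's default endpoint path:

# test_virtuoso.py -- sanity-check the local Virtuoso SPARQL endpoint
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:3001/sparql")
sparql.setQuery("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 3")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding)  # a few arbitrary triples confirm the DB is loaded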
Download the preprocessed data from Google Drive or Baidu Drive, and unzip data.zip to Amar/data.
More details on entity/relation retrieval can be found in GMT-KBQA, and on subgraph retrieval in DECAF.
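After unzipping, you can quickly confirm the data is in place. A trivial check; it only assumes data.zip unpacks into Amar/data, so just inspect the listing:

# run from Amar/
import os

data_dir = "data"
assert os.path.isdir(data_dir), "unzip data.zip to Amar/data first"
print(sorted(os.listdir(data_dir)))  # inspect the unpacked retrieval files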
Change the --model_name_or_path in run_ft.sh to your LLM checkpoint path.
Reproduce the results for CWQ and WebQSP by executing the following:
bash run_all.sh
Alternatively, you can run the commands step by step as shown below:
CUDA_VISIBLE_DEVICES=0 bash run_ft.sh WebQSP LLaMA-2-7b-hf webqsp_100_7_32_16 train 100 7 0 32 16 15
CUDA_VISIBLE_DEVICES=0 bash run_ft.sh CWQ LLaMA-2-13b-hf cwq_4_16_32_16 train 4 16 0 32 16 8
CUDA_VISIBLE_DEVICES=0 bash run_ft.sh WebQSP LLaMA-2-7b-hf webqsp_100_7_32_16 test 100 7 0 32 16 15
CUDA_VISIBLE_DEVICES=0 bash run_ft.sh CWQ LLaMA-2-13b-hf cwq_4_16_32_16 test 4 16 0 32 16 8
CUDA_VISIBLE_DEVICES=0 python -u eval_final.py --dataset WebQSP --pred_file Reading/LLaMA-2-7b-hf/WebQSP_webqsp_100_7_32_16/evaluation_beam/beam_test_top_k_predictions.json
CUDA_VISIBLE_DEVICES=0 python -u eval_final.py --dataset CWQ --pred_file Reading/LLaMA-2-13b-hf/CWQ_cwq_4_16_32_16/evaluation_beam/beam_test_top_k_predictions.json
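If you want to inspect the predictions before or after scoring, the file is plain JSON. A minimal sketch; the exact schema is defined by the repo's generation code, so the peek below is illustrative rather than definitive:

# inspect_predictions.py -- peek at a beam-search predictions file
import json

pred_file = ("Reading/LLaMA-2-7b-hf/WebQSP_webqsp_100_7_32_16/"
             "evaluation_beam/beam_test_top_k_predictions.json")
with open(pred_file) as f:
    predictions = json.load(f)

# the top level may be a list or a dict keyed by question id
first = predictions[0] if isinstance(predictions, list) else next(iter(predictions.values()))
print(len(predictions), "examples")
print(json.dumps(first, indent=2)[:500])  # truncated view of the first entry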
To evaluate with golden entities:
CUDA_VISIBLE_DEVICES=0 python -u eval_final.py --dataset WebQSP --pred_file Reading/LLaMA-2-7b-hf/WebQSP_webqsp_100_7_32_16/evaluation_beam/beam_test_top_k_predictions.json --golden_ent
CUDA_VISIBLE_DEVICES=0 python -u eval_final.py --dataset CWQ --pred_file Reading/LLaMA-2-13b-hf/CWQ_cwq_4_16_32_16/evaluation_beam/beam_test_top_k_predictions.json --golden_ent
This repo builds on ChatKBQA, GMT-KBQA, and DECAF. Thanks for their great work!