This is the repo for the paper *Can LLMs be Good Graph Judge for Knowledge Graph Construction?*
LoRA weights have been released on 🤗 Hugging Face.
- Accepted to EMNLP 2025 Main Conference! 🎉
```
.
├── README.md
├── .gitignore
├── LICENSE
├── chat
│   ├── run_chatgpt_entity.py
│   ├── run_chatgpt_triple.py
│   ├── run_chatgpt_gj.py
│   └── run_chatgpt.py
├── datasets
│   ├── GPT4o_mini_result_GenWiki-Hard
│   ├── GPT4o_mini_result_rebel_sub
│   ├── GPT4o_mini_result_SCIERC
│   ├── prepare_KGCom.ipynb
│   └── process_data.ipynb
├── graph_evaluation
│   ├── metrics
│   │   ├── eval.py
│   │   └── graph_matching.py
│   └── eval.sh
├── graphjudger
│   ├── data
│   │   ├── genwiki
│   │   ├── rebel_sub
│   │   └── scierc
│   ├── models
│   │   └── {your lora weights here}
│   ├── bert_classifier_finetune.py
│   ├── bert_classifier_infer.py
│   ├── lora_finetune_genwiki_context.py
│   ├── lora_finetune_rebel_context.py
│   ├── lora_finetune_scierc_context.py
│   ├── lora_infer_batch_case_gj.py
│   ├── lora_infer_batch_naive.py
│   ├── lora_infer_batch.py
│   └── lora_infer.py
└── img
    ├── graphjudger_context.drawio.png
    └── icon.png
```
- `datasets` contains the used datasets; `prepare_KGCom.ipynb` is used to "Generate Instruction for Graph Judgement", "Generate test data for Graph Judgement", and "Filter the generated graphs with Judgement result" (an illustrative instruction record is sketched after this list).
- `graph_evaluation` contains the graph evaluation metrics.
- `chat` contains the scripts that prompt naive LLMs.
- `graph_judger` contains the SFT and inference scripts.
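The graph-judgement instruction data generated by `prepare_KGCom.ipynb` (`train_instructions_llama.json`) is an instruction-tuning file. The exact wording is defined in the notebook; the record below is only an illustrative, Alpaca-style sketch:

```python
# Illustrative only: an Alpaca-style graph-judgement record.
# Field names and prompt/answer wording are assumptions, not copied from prepare_KGCom.ipynb.
example_record = {
    "instruction": "Is this true: Albert Einstein developed the theory of relativity ?",
    "input": "",
    "output": "Yes, it is true.",
}
```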
To reproduce the pipeline, first run:
```bash
python ./chat/run_chatgpt_entity.py
```
This produces the denoised text data and the extracted entities. Don't forget to change the iteration parameters in the code (the step can be run iteratively to improve performance). The results are saved under `./datasets/GPT4o_mini_result_{dataset}/Iteration{i}`.
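Under the hood, this step simply prompts an OpenAI chat model. A minimal sketch of such a call, using the current `openai` Python client with an illustrative prompt (not the repo's exact prompt):

```python
# Minimal sketch of the kind of call run_chatgpt_entity.py makes.
# The prompt wording and helper name are illustrative, not taken from the repo.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_entities(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"List the named entities in the following text, one per line:\n{text}",
        }],
        temperature=0,
    )
    return response.choices[0].message.content
```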
Then run:
```bash
python ./chat/run_chatgpt_triple.py
```
This generates the KG from the denoised text data and entities, saved under `./datasets/GPT4o_mini_result_{dataset}/Graph_Iteration{i}`.
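Each line of the generated-graph file holds the triples extracted for one passage. The exact serialization is defined by the scripts; the sketch below assumes one Python-literal list of `[subject, relation, object]` triples per line, which may differ from the actual format:

```python
# Sketch for loading generated graphs; the per-line format is an assumption.
import ast

def load_generated_graphs(path: str):
    graphs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                graphs.append(ast.literal_eval(line))  # e.g. [["Einstein", "developed", "relativity"], ...]
    return graphs

graphs = load_generated_graphs(
    "./datasets/GPT4o_mini_result_GenWiki-Hard/Graph_Iteration1/test_generated_graphs.txt"
)
```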
Next, you need to perform SFT for the graph judgement task, using your training data `./datasets/GPT4o_mini_result_{dataset}/train.source` and your generated graph `./datasets/GPT4o_mini_result_{dataset}/Graph_Iteration{i}/test_generated_graphs.txt`. Training-instruction generation, test-data generation, and triple filtering are all handled in `./datasets/prepare_KGCom.ipynb`.
For the graph judgement task, first run the first cell of `./datasets/prepare_KGCom.ipynb` to obtain the instruction data `./datasets/GPT4o_mini_result_{dataset}/train_instructions_llama.json`. Put it in `./graph_judger/data/{dataset}`, and then run:
```bash
cd ./graph_judger
python lora_finetune_{dataset}_context.py
```
This produces the fine-tuned model under `./graph_judger/models/`. Next, run the second cell of `./datasets/prepare_KGCom.ipynb` to generate test data from the generated triples, and move the resulting `test_instructions_context_llama2_7b_itr{i}.csv` under `./graph_judger/data/{dataset}`.
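For reference, the `lora_finetune_{dataset}_context.py` scripts follow the standard PEFT/LoRA recipe for a LLaMA-style base model. A stripped-down sketch of that pattern (base model name and hyperparameters are illustrative, not the repo's exact values):

```python
# Stripped-down LoRA setup in the spirit of lora_finetune_{dataset}_context.py.
# Base model, rank, and target modules are illustrative, not the repo's exact values.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# ...then train on train_instructions_llama.json with your preferred training loop
# and save the adapter under ./graph_judger/models/.
```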
Then run:
```bash
cd ./graph_judger
python lora_infer_batch.py
```
This produces the judgement result for every generated triple, `pred_instructions_context_llama2_7b_itr{i}.csv`. Put it under `./datasets/GPT4o_mini_result_{dataset}/Graph_Iteration{i}`.
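At inference time, the batch inference script loads the LoRA adapter on top of the base model and asks it to judge each triple. A hedged sketch of that pattern (adapter path, prompt, and generation settings are illustrative):

```python
# Sketch of LoRA-adapter inference in the spirit of lora_infer_batch.py.
# The adapter path, prompt wording, and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, "./graph_judger/models/your_lora_adapter")
model.eval()

prompt = "Is this true: Albert Einstein developed the theory of relativity ?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```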
Then run the third cell of `./datasets/prepare_KGCom.ipynb` to remove the inaccurate triples from `./datasets/GPT4o_mini_result_{dataset}/Graph_Iteration{i}/test_generated_graphs.txt`, using the labels predicted in `pred_instructions_context_llama2_7b_itr{i}.csv`. This yields the final result of our method, `./datasets/GPT4o_mini_result_{dataset}/Graph_Iteration{i}/test_generated_graphs_final.txt`.
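The filtering itself amounts to keeping only the triples the judge labels as true. A minimal sketch, assuming hypothetical `prompt`/`label` column names and a Yes/No answer convention in the prediction CSV:

```python
# Sketch of the triple-filtering step from the third cell of prepare_KGCom.ipynb.
# The column names ("prompt", "label") and the Yes/No convention are assumptions.
import pandas as pd

pred = pd.read_csv(
    "./datasets/GPT4o_mini_result_GenWiki-Hard/Graph_Iteration1/pred_instructions_context_llama2_7b_itr1.csv"
)
kept = pred[pred["label"].astype(str).str.contains("yes", case=False, na=False)]

with open(
    "./datasets/GPT4o_mini_result_GenWiki-Hard/Graph_Iteration1/test_generated_graphs_final.txt",
    "w", encoding="utf-8",
) as f:
    for triple_text in kept["prompt"]:
        f.write(f"{triple_text}\n")
```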
For evaluation, modify the paths in `./graph_evaluation/eval.sh` to point to the result you want to evaluate. Don't forget to put the BERT model under `./graph_evaluation/`.
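For reference, knowledge-graph construction results are commonly scored with set-based triple matching (precision/recall/F1) alongside soft matching metrics; the sketch below shows the set-based variant and is not necessarily the exact metric implemented in `./graph_evaluation/metrics/eval.py`:

```python
# Reference sketch of set-based triple precision/recall/F1.
# Not necessarily the exact metrics implemented in graph_evaluation/metrics/eval.py.
def triple_prf(pred_triples, gold_triples):
    pred = {tuple(t) for t in pred_triples}
    gold = {tuple(t) for t in gold_triples}
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1
```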
Then run:
```bash
cd ./graph_evaluation
bash eval.sh
```
For naive LLM baselines, run:
```bash
python ./chat/run_chatgpt.py
```
To cite this work:
```bibtex
@misc{huang2025llmsgoodgraphjudge,
      title={Can LLMs be Good Graph Judge for Knowledge Graph Construction?},
      author={Haoyu Huang and Chong Chen and Zeang Sheng and Yang Li and Wentao Zhang},
      year={2025},
      eprint={2411.17388},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.17388},
}
```