A framework for using large language models (LLMs) for automated program repair.
Supported benchmarks:
- Defects4J
- GitBug-Java
- HumanEval-Java
- QuixBugs
For the RepairBench patches and results, please refer to https://github.com/ASSERT-KTH/repairbench
If you use this code, please cite:
@techreport{repairbench,
  title = {RepairBench: Leaderboard of Frontier Models for Program Repair},
  author = {André Silva and Martin Monperrus},
  year = {2024},
  url = {https://arxiv.org/abs/2409.18952},
  number = {2409.18952},
  institution = {arXiv},
}
Requires Python 3.11 (or newer) and python-poetry.
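If Poetry is not already installed, one common way to install it (just an example; any official installation method works) is:
pipx install poetry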
To set up repairbench-framework, run the following command:
./setup.sh
Note: By default, GitBug-Java will be installed. This benchmark is heavy (it requires ~130 GiB of free disk space). If you do not need GitBug-Java, comment out the commands in setup.sh that refer to it before running the script.
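To locate those commands, you can, for example, search the script (the exact naming inside setup.sh may differ):
grep -n -i gitbug setup.sh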
Make sure you are in the correct environment:
poetry shell
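Alternatively, each command below can be run without entering the shell by prefixing it with poetry run, shown here with the sample-generation command from the next step:
poetry run python generate_samples.py defects4j instruct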
Example of how to generate samples for Defects4J using the instruct strategy:
python generate_samples.py defects4j instruct
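To sanity-check the output, a minimal Python sketch that counts the generated samples (the file name is taken from the example above):
with open("samples_defects4j_instruct_.jsonl") as f:
    print(sum(1 for _ in f), "samples generated")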
Example of how to generate patches for the samples:
python generate_patches.py samples_defects4j_instruct_.jsonl openai-chatcompletion --model-name gpt-4o-mini --n_workers 1 --num_return_sequences 10 --temperature 1.0
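Note that the openai-chatcompletion strategy calls the OpenAI API, so it presumably expects an API key in your environment (an assumption; adapt to your provider):
export OPENAI_API_KEY=<your-key>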
Example of how to evaluate the generated patches:
python evaluate_patches.py defects4j candidates_defects4j_instruct_gpt-4o-mini.jsonl.gz openai
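The candidates file is gzip-compressed JSON Lines. To peek at its first record without the framework, a small sketch using only the Python standard library (the exact fields depend on the strategy and model):
import gzip
import json

with gzip.open("candidates_defects4j_instruct_gpt-4o-mini.jsonl.gz", "rt") as f:
    first_record = json.loads(next(f))
print(sorted(first_record.keys()))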
Example of how to export the evaluated patches:
python export_results.py defects4j evaluation_defects4j_instruct_openai.jsonl --model_name gpt-4o-mini
How to run tests:
pytest -s tests/
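To run only a subset of the tests, pytest's -k expression filter can be used (the pattern below is illustrative):
pytest -s -k "defects4j" tests/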
How to lint your code:
black elleelleaime tests *.py
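To only check formatting without rewriting files (e.g., in CI), use black's --check flag:
black --check elleelleaime tests *.py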
We store all results (prompts, patches, evaluations) in a separate repository: https://github.com/ASSERT-KTH/repairbench