add docs definitely not generated by chatgpt
waleko committed Sep 17, 2023
1 parent b691824 commit b19ad4a
Showing 4 changed files with 130 additions and 12 deletions.
66 changes: 64 additions & 2 deletions README.md
@@ -1,4 +1,66 @@
-# CodeReviewer ML Performance
# Code Review Automation with Language Models

-![Static Badge](https://img.shields.io/badge/docs-available-orange?style=flat-square)
[![Static Badge](https://img.shields.io/badge/docs-available-orange?style=flat-square)](https://alexkovrigin.me/Code-Review-Automation-LM)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square)](https://github.com/psf/black)

## Overview

Code review is a crucial aspect of the software development process, ensuring that code changes are thoroughly examined
for quality, security, and adherence to coding standards. However, the code review process can be time-consuming, and
human reviewers may overlook certain issues. To address these challenges, we have developed a Code Review Automation
system powered by language models.

Our system leverages state-of-the-art language models to generate code reviews automatically. These models are trained
on a vast corpus of code and can provide insightful feedback on code changes. By automating part of the code review
process, our system aims to:

- Speed up the code review process.
- Identify common code issues and provide recommendations.
- Assist developers in producing higher-quality code.

## Key Features

### 1. Data Collection

Our system collects code review data from popular GitHub repositories. This data includes code changes and associated
human-authored code reviews. By leveraging this data, our models learn to generate contextually relevant code reviews.

### 2. Model Inference and Fine-Tuning

We use pre-trained language models and fine-tune them on code review datasets. Fine-tuning allows the models to
specialize in generating code reviews, making them more effective in this task.

Once the models are trained, they can generate code reviews for new code changes. These generated reviews can highlight
potential issues, suggest improvements, and provide feedback to developers.
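
Below is a minimal inference sketch using the HuggingFace `transformers` library and the `microsoft/codereviewer` checkpoint used in the accompanying notebooks, assuming the checkpoint and its tokenizer load directly from the Hub; the notebooks' actual preprocessing (including CodeReviewer's special diff tokens) is more involved.

```python
# Illustrative only: generate a draft review comment for a small diff hunk.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "microsoft/codereviewer"  # pre-trained CodeReviewer checkpoint on the HF Hub
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

diff_hunk = (
    "@@ -10,3 +10,4 @@ def connect(url):\n"
    "     session = Session()\n"
    "-    session.get(url)\n"
    "+    response = session.get(url)\n"
    "+    return response\n"
)

inputs = tokenizer(diff_hunk, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=128,
    num_beams=5,        # beam search tends to produce more fluent review text
    early_stopping=True,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```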

### 3. Evaluation Metrics

We use the BLEU-4 score metric to assess the quality of generated code reviews. This metric measures the similarity
between model-generated reviews and target human reviews. While our models provide valuable assistance, they are
designed to complement human reviewers.
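
As an illustration of the metric (not necessarily the exact implementation used in our evaluation), BLEU-4 can be computed with NLTK as follows:

```python
# Illustrative BLEU-4 computation between a human review and a generated review.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "please add a null check before dereferencing the result".split()
candidate = "consider adding a null check for the result".split()

score = sentence_bleu(
    [reference],                                      # tokenized human (target) review(s)
    candidate,                                        # tokenized model-generated review
    weights=(0.25, 0.25, 0.25, 0.25),                 # BLEU-4: equal weight on 1- to 4-grams
    smoothing_function=SmoothingFunction().method1,   # avoids zero scores on short texts
)
print(f"BLEU-4: {score:.3f}")
```

Scores range from 0 to 1, with higher values indicating greater n-gram overlap with the human-written review.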

## Getting Started

To get started with our Code Review Automation system, follow these steps:

1. Clone this repository to your local machine:

```bash
git clone https://github.com/waleko/Code-Review-Automation-LM.git
cd Code-Review-Automation-LM
```

2. Set up the required dependencies and environment (see `requirements.txt`).

3. Run the provided notebooks to explore data collection, model inference, and evaluation.

4. Integrate the code review automation system into your development workflow. You can use our pre-trained models or
fine-tune them on your specific codebase for even better results.

## License

This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

## Contact

For any questions or inquiries, please contact [[email protected]](mailto:[email protected]).
7 changes: 3 additions & 4 deletions _config.yml
@@ -1,7 +1,7 @@
# Book settings
# Learn more at https://jupyterbook.org/customize/config.html

-title: CodeReviewer ML Performance
title: Code Review Automation with Language Models
author: Alexander Kovrigin
copyright: "2023"

@@ -21,9 +21,8 @@ bibtex_bibfiles:

# Information about where the book exists on the web
repository:
-url: https://github.com/waleko/CodeReviewer-ML-Performance # Online location of your book
-path_to_book: docs # Optional path to your book, relative to the repository root
-branch: main # Which branch of the repository should be used when creating links (optional)
url: https://github.com/waleko/Code-Review-Automation-LM # Online location of your book
branch: gh-pages # Which branch of the repository should be used when creating links (optional)

# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
17 changes: 17 additions & 0 deletions docs/conclusion.md
@@ -0,0 +1,17 @@
# Conclusion
In our exploration of code review data collection and model inference, we have gained valuable insights into the capabilities and limitations of language models in the context of code review. This journey has encompassed various notebooks, each focusing on a specific aspect of the process. Here, we summarize our key findings and the implications of our work:

- Language models show promise in generating code reviews, but there is ample room for improvement in terms of review quality, context, and relevance.

- Fine-tuning models on code review datasets is a valuable approach to enhance their performance, but further research is needed to optimize fine-tuning techniques.

- While models can assist in code reviews, they should be viewed as complementary tools to human reviewers rather than replacements. Human expertise remains invaluable in the code review process.

- Future work may involve exploring more advanced language models, experimenting with different fine-tuning strategies, and incorporating user feedback to refine predictions.

In conclusion, while challenges remain, language models have the potential to augment the code review process and help developers produce higher-quality code. As the technology continues to advance, we anticipate further progress in this field and a continued focus on improving the effectiveness of code review automation.

## Bibliography

```{bibliography}
```
52 changes: 46 additions & 6 deletions docs/intro.md
@@ -1,11 +1,51 @@
-# CodeReviewer ML Performance
# Code Review Automation with Language Models

-This is a small sample book to give you a feel for how book content is
-structured.
-It shows off a few of the major file types, as well as some sample content.
-It does not go in-depth into any particular topic - check out [the Jupyter Book documentation](https://jupyterbook.org) for more information.
## Introduction

-Check out the content pages bundled with this sample book to see more.
In this series of Jupyter notebooks, we embark on a journey to collect code review data from GitHub repositories and
perform code review predictions using language models. Our primary goal is to explore the capabilities of different
models in generating code reviews and evaluate their performance.

### Collecting Code Review Data

In this initial notebook, we dive into the process of collecting code review data from GitHub repositories. We
leverage the PyGithub library to interact with the GitHub API.

We establish a function to collect code review data from a GitHub repository, allowing us to specify parameters such as
the number of comments to load, skipping author comments, and more. The collected data is structured into a Pandas
DataFrame for further analysis and processing.

Three prominent repositories, namely `microsoft/vscode`, `JetBrains/kotlin`, and `transloadit/uppy`, are selected for
data collection due to their popularity and rich code review history. We also use the `msg-test` data from the
original CodeReviewer dataset provided by the authors of {cite}`li2022codereviewer`.
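
The sketch below illustrates this collection step with PyGithub and pandas; the repository, token handling, and
filtering shown here are illustrative rather than the notebook's actual function signature.

```python
# Hypothetical sketch: collect review comments from a repository into a DataFrame.
# Assumes a GitHub personal access token in the GITHUB_TOKEN environment variable.
import os

import pandas as pd
from github import Github

gh = Github(os.environ["GITHUB_TOKEN"])
repo = gh.get_repo("transloadit/uppy")

rows = []
for pr in repo.get_pulls(state="closed")[:20]:       # limit PRs for a quick demo
    for comment in pr.get_review_comments():
        if comment.user.login == pr.user.login:      # skip the author's own comments
            continue
        rows.append(
            {
                "pr_number": pr.number,
                "path": comment.path,
                "diff_hunk": comment.diff_hunk,      # code change the comment refers to
                "review": comment.body,              # human-authored review text
            }
        )

df = pd.DataFrame(rows)
print(df.head())
```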

### CodeReviewer Model Inference

The second notebook focuses on generating code reviews using the `microsoft/codereviewer` model. We delve into the
tokenization and dataset preparation process, emphasizing the importance of specialized tokens.

A custom `ReviewsDataset` class is introduced to facilitate data loading and transformation, making it compatible with
model inference. We load data from various sources, creating DataLoader instances for efficient model input.

We explore the model inference process, employing both a HuggingFace pre-trained checkpoint and a fine-tuned
CodeReviewer model. The fine-tuning setup is outlined, including the parameters and resources used, and the resulting
model predictions are saved for later evaluation.
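
A simplified stand-in for this data-loading step is sketched below; the notebook's actual `ReviewsDataset` additionally
handles CodeReviewer's specialized tokens and the target reviews, so treat this only as an outline of the data flow.

```python
# Illustrative sketch of a diff-hunk dataset wrapped in a DataLoader for batched inference.
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer


class ReviewsDataset(Dataset):
    def __init__(self, diff_hunks, tokenizer, max_length=512):
        self.diff_hunks = diff_hunks
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.diff_hunks)

    def __getitem__(self, idx):
        enc = self.tokenizer(
            self.diff_hunks[idx],
            truncation=True,
            max_length=self.max_length,
            padding="max_length",
            return_tensors="pt",
        )
        # Drop the batch dimension added by return_tensors="pt".
        return {k: v.squeeze(0) for k, v in enc.items()}


tokenizer = AutoTokenizer.from_pretrained("microsoft/codereviewer")
hunks = ["@@ -1,2 +1,2 @@\n-x = foo()\n+x = foo(timeout=5)\n"]
loader = DataLoader(ReviewsDataset(hunks, tokenizer), batch_size=8, shuffle=False)

for batch in loader:
    print(batch["input_ids"].shape)  # (batch_size, max_length), ready for model.generate
```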

### Predictions Evaluation

In this notebook, we assess the quality of code review predictions generated by the models. Both HuggingFace pre-trained and
fine-tuned models are evaluated across different datasets, shedding light on their performance.

Qualitative assessment is conducted to gain insights into how the models generate code reviews. We present samples of
code, along with predictions from both models, enabling a visual comparison with human reviews. This helps in
understanding the nuances of model-generated reviews.

Lastly, we quantitatively evaluate the models' performance using BLEU-4 scores. We calculate scores for each dataset,
providing a comprehensive overview of how well the models align with human reviews. This quantitative analysis helps in
drawing conclusions about the effectiveness of the models in code review prediction.
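
A dataset-level variant of this computation is sketched below, assuming predictions and human reviews sit in a pandas
DataFrame with hypothetical `prediction` and `target` columns; the notebook's exact scoring code may differ.

```python
# Illustrative corpus-level BLEU-4 over a DataFrame of predictions and human reviews.
import pandas as pd
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

df = pd.DataFrame(
    {
        "target": ["please add a null check here", "rename this variable for clarity"],
        "prediction": ["add a null check here", "consider renaming the variable"],
    }
)

references = [[t.split()] for t in df["target"]]      # one list of references per example
hypotheses = [p.split() for p in df["prediction"]]    # one tokenized prediction per example

score = corpus_bleu(
    references,
    hypotheses,
    weights=(0.25, 0.25, 0.25, 0.25),
    smoothing_function=SmoothingFunction().method4,
)
print(f"dataset BLEU-4: {score:.3f}")
```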

Throughout this journey, we aim to explore the capabilities and limitations of language models in the context of code
review, shedding light on their potential applications and areas for improvement.

## Table of Contents

