
Commit

deploy: f1d1583
waleko committed Sep 17, 2023
1 parent ff0f1a7 commit 83a40ea
Showing 5 changed files with 44 additions and 77 deletions.
45 changes: 20 additions & 25 deletions README.html
@@ -325,7 +325,7 @@ <h2> Contents </h2>
<nav aria-label="Page">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#overview">Overview</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#key-features">Key Features</a><ul class="nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#contents">Contents</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#data-collection">1. Data Collection</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#model-inference-and-fine-tuning">2. Model Inference and Fine-Tuning</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#evaluation-metrics">3. Evaluation Metrics</a></li>
@@ -347,58 +347,53 @@

<section class="tex2jax_ignore mathjax_ignore" id="code-review-automation-with-language-models">
<h1>Code Review Automation with Language Models<a class="headerlink" href="#code-review-automation-with-language-models" title="Permalink to this heading">#</a></h1>
<p><a class="reference external" href="https://alexkovrigin.me/Code-Review-Automation-LM"><img alt="Static Badge" src="https://img.shields.io/badge/docs-available-orange?style=flat-square" /></a>
<a class="reference external" href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square" /></a></p>
<p><a class="reference external" href="https://alexkovrigin.me/Code-Review-Automation-LM"><img alt="Static Badge" src="https://img.shields.io/badge/jupyter-book-orange?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABwAAAAZCAMAAAAVHr4VAAAAXVBMVEX////v7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/v7+/zdybv7+/zdybv7+/v7+/zdybv7+/zdybv7+/zdyaSmqV2AAAAHXRSTlMAEBAgIDAwQEBQUGBgcHCAgJCQoLCwwMDQ4ODw8MDkUIUAAADJSURBVHjaddAFkgNBCAXQP+7uAvc/5tLFVseYF8crUB0560r/5gwvjYYm8gq8QJoyIJNwlnUH0WEnART6YSezV6c5tjOTaoKdfGXtnclFlEBEXVd8JzG4pa/LDql9Jff/ZCC/h2zSqF5bzf4vqkgNwEzeClUd8uMadLE6OnhBFsES5niQh2BOYUqZsfGdmrmbN+TMvPROHUOkde8sEs6Bnr0tDDf2Roj6fmVfubuGyttejCeLc+xFm+NLuLnJeFAyl3gS932MF/wBoukfUcwI05kAAAAASUVORK5CYII=&amp;style=for-the-badge" /></a></p>
<section id="overview">
<h2>Overview<a class="headerlink" href="#overview" title="Permalink to this heading">#</a></h2>
<p>Code review is a crucial aspect of the software development process, ensuring that code changes are thoroughly examined
for quality, security, and adherence to coding standards. However, the code review process can be time-consuming, and
human reviewers may overlook certain issues. To address these challenges, we have developed a Code Review Automation
system powered by language models.</p>
<p>Our system leverages state-of-the-art language models to generate code reviews automatically. These models are trained
on a vast corpus of code and can provide insightful feedback on code changes. By automating part of the code review
process, our system aims to:</p>
<ul class="simple">
<li><p>Speed up the code review process.</p></li>
<li><p>Identify common code issues and provide recommendations.</p></li>
<li><p>Assist developers in producing higher-quality code.</p></li>
</ul>
human reviewers may overlook certain issues.</p>
<p>In this series of Jupyter notebooks, we embark on a journey to collect code review data from GitHub repositories and
perform code review predictions using a prominent language model: <a class="reference external" href="https://arxiv.org/abs/2203.09095">CodeReviewer</a> from
Microsoft Research. Our primary goal is to explore the capabilities of this model in generating code reviews and
evaluate its performance.</p>
</section>
<section id="key-features">
<h2>Key Features<a class="headerlink" href="#key-features" title="Permalink to this heading">#</a></h2>
<section id="contents">
<h2>Contents<a class="headerlink" href="#contents" title="Permalink to this heading">#</a></h2>
<section id="data-collection">
<h3>1. Data Collection<a class="headerlink" href="#data-collection" title="Permalink to this heading">#</a></h3>
<p>Our system collects code review data from popular GitHub repositories. This data includes code changes and associated
human-authored code reviews. By leveraging this data, our models learn to generate contextually relevant code reviews.</p>
<p>First, we collect the code review data from popular GitHub repositories. This data includes code changes and associated
human-authored code reviews. By leveraging this data, the model learns to generate contextually relevant code reviews.</p>
</section>
<section id="model-inference-and-fine-tuning">
<h3>2. Model Inference and Fine-Tuning<a class="headerlink" href="#model-inference-and-fine-tuning" title="Permalink to this heading">#</a></h3>
<p>We use pre-trained language models and fine-tune them on code review datasets. Fine-tuning allows the models to
<p>We take the pre-trained CodeReviewer checkpoint and fine-tune the model on code review datasets. Fine-tuning allows the models to
specialize in generating code reviews, making them more effective in this task.</p>
<p>Once the models are trained, they can generate code reviews for new code changes. These generated reviews can highlight
potential issues, suggest improvements, and provide feedback to developers.</p>
</section>
<section id="evaluation-metrics">
<h3>3. Evaluation Metrics<a class="headerlink" href="#evaluation-metrics" title="Permalink to this heading">#</a></h3>
<p>We use the BLEU-4 score metric to assess the quality of generated code reviews. This metric measures the similarity
between model-generated reviews and target human reviews. While our models provide valuable assistance, they are
designed to complement human reviewers.</p>
between model-generated reviews and target human reviews.</p>
</section>
</section>
<section id="getting-started">
<h2>Getting Started<a class="headerlink" href="#getting-started" title="Permalink to this heading">#</a></h2>
<p>To get started with our Code Review Automation system, follow these steps:</p>
<p>To get started with our work, follow these steps:</p>
<ol class="arabic">
<li><p>Clone this repository to your local machine:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>git<span class="w"> </span>clone<span class="w"> </span>https://github.com/waleko/Code-Review-Automation-LM.git
<span class="nb">cd</span><span class="w"> </span>Code-Review-Automation-LM
</pre></div>
</div>
</li>
<li><p>Set up the required dependencies and environment (see <code class="docutils literal notranslate"><span class="pre">requirements.txt</span></code>).</p></li>
<li><p>Set up the required dependencies from <code class="docutils literal notranslate"><span class="pre">requirements.txt</span></code>, e.g. using <code class="docutils literal notranslate"><span class="pre">pip</span></code>:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span>-r<span class="w"> </span>requirements.txt
</pre></div>
</div>
</li>
<li><p>Run the provided notebooks to explore data collection, model inference, and evaluation.</p></li>
<li><p>Integrate the code review automation system into your development workflow. You can use our pre-trained models or
fine-tune them on your specific codebase for even better results.</p></li>
</ol>
</section>
<section id="license">
@@ -457,7 +452,7 @@ <h2>Contact<a class="headerlink" href="#contact" title="Permalink to this headin
<nav class="bd-toc-nav page-toc">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#overview">Overview</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#key-features">Key Features</a><ul class="nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#contents">Contents</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#data-collection">1. Data Collection</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#model-inference-and-fine-tuning">2. Model Inference and Fine-Tuning</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#evaluation-metrics">3. Evaluation Metrics</a></li>
39 changes: 17 additions & 22 deletions _sources/README.md
@@ -1,33 +1,28 @@
# Code Review Automation with Language Models

[![Static Badge](https://img.shields.io/badge/docs-available-orange?style=flat-square)](https://alexkovrigin.me/Code-Review-Automation-LM)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square)](https://github.com/psf/black)
[![Static Badge](https://img.shields.io/badge/jupyter-book-orange?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABwAAAAZCAMAAAAVHr4VAAAAXVBMVEX////v7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/zdybv7+/v7+/zdybv7+/zdybv7+/v7+/zdybv7+/zdybv7+/zdyaSmqV2AAAAHXRSTlMAEBAgIDAwQEBQUGBgcHCAgJCQoLCwwMDQ4ODw8MDkUIUAAADJSURBVHjaddAFkgNBCAXQP+7uAvc/5tLFVseYF8crUB0560r/5gwvjYYm8gq8QJoyIJNwlnUH0WEnART6YSezV6c5tjOTaoKdfGXtnclFlEBEXVd8JzG4pa/LDql9Jff/ZCC/h2zSqF5bzf4vqkgNwEzeClUd8uMadLE6OnhBFsES5niQh2BOYUqZsfGdmrmbN+TMvPROHUOkde8sEs6Bnr0tDDf2Roj6fmVfubuGyttejCeLc+xFm+NLuLnJeFAyl3gS932MF/wBoukfUcwI05kAAAAASUVORK5CYII=&style=for-the-badge)](https://alexkovrigin.me/Code-Review-Automation-LM)

## Overview

Code review is a crucial aspect of the software development process, ensuring that code changes are thoroughly examined
for quality, security, and adherence to coding standards. However, the code review process can be time-consuming, and
human reviewers may overlook certain issues. To address these challenges, we have developed a Code Review Automation
system powered by language models.
human reviewers may overlook certain issues.

Our system leverages state-of-the-art language models to generate code reviews automatically. These models are trained
on a vast corpus of code and can provide insightful feedback on code changes. By automating part of the code review
process, our system aims to:
In this series of Jupyter notebooks, we embark on a journey to collect code review data from GitHub repositories and
perform code review predictions using a prominent language model: [CodeReviewer](https://arxiv.org/abs/2203.09095) from
Microsoft Research. Our primary goal is to explore the capabilities of this model in generating code reviews and
evaluate its performance.

- Speed up the code review process.
- Identify common code issues and provide recommendations.
- Assist developers in producing higher-quality code.

## Key Features
## Contents

### 1. Data Collection

Our system collects code review data from popular GitHub repositories. This data includes code changes and associated
human-authored code reviews. By leveraging this data, our models learn to generate contextually relevant code reviews.
First, we collect the code review data from popular GitHub repositories. This data includes code changes and associated
human-authored code reviews. By leveraging this data, the model learns to generate contextually relevant code reviews.
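
As a rough illustration of this step, the sketch below pulls pull-request review comments with PyGithub; the repository name, sample size, and selected fields are placeholders rather than the notebooks' exact collection logic.

```python
import os

import pandas as pd
from github import Github  # PyGithub

# A personal access token avoids the strict anonymous rate limits.
gh = Github(os.environ["GITHUB_TOKEN"])
repo = gh.get_repo("microsoft/vscode")  # one of the collected repositories

rows = []
for pr in repo.get_pulls(state="closed")[:50]:  # small sample for illustration
    for comment in pr.get_review_comments():
        rows.append({
            "pr_number": pr.number,
            "diff_hunk": comment.diff_hunk,  # code change the reviewer commented on
            "review": comment.body,          # human-authored review comment
        })

df = pd.DataFrame(rows)
df.to_json("reviews_sample.jsonl", orient="records", lines=True)
```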

### 2. Model Inference and Fine-Tuning

We use pre-trained language models and fine-tune them on code review datasets. Fine-tuning allows the models to
We take the pre-trained CodeReviewer checkpoint and fine-tune the model on code review datasets. Fine-tuning allows the models to
specialize in generating code reviews, making them more effective in this task.

Once the models are trained, they can generate code reviews for new code changes. These generated reviews can highlight
@@ -36,12 +31,11 @@ potential issues, suggest improvements, and provide feedback to developers.
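
To make the inference step concrete, here is a minimal sketch that loads the `microsoft/codereviewer` checkpoint with HuggingFace `transformers` and generates a review comment for a toy diff hunk. The input formatting is simplified (the notebooks prepare inputs with the model's special diff tokens), so treat it as illustrative only.

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

checkpoint = "microsoft/codereviewer"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

# A toy diff hunk; real inputs also carry the model's special keep/add/del tokens.
diff_hunk = (
    "@@ -10,4 +10,4 @@ def total(items):\n"
    "-    return sum(items) / len(items)\n"
    "+    return sum(items) / max(len(items), 1)\n"
)

inputs = tokenizer(diff_hunk, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs, max_length=128, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```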
### 3. Evaluation Metrics

We use the BLEU-4 score metric to assess the quality of generated code reviews. This metric measures the similarity
between model-generated reviews and target human reviews. While our models provide valuable assistance, they are
designed to complement human reviewers.
between model-generated reviews and target human reviews.
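
For reference, a single BLEU-4 score between a generated review and its human-written target can be computed along these lines (NLTK here; the notebooks' exact tokenization and smoothing choices may differ):

```python
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

reference = "please extract this logic into a helper function".split()
candidate = "consider extracting this logic into a helper method".split()

score = sentence_bleu(
    [reference],                                     # list of reference token lists
    candidate,
    weights=(0.25, 0.25, 0.25, 0.25),                # BLEU-4: 1- to 4-gram precision
    smoothing_function=SmoothingFunction().method1,  # avoids zero scores on short texts
)
print(f"BLEU-4: {score:.3f}")
```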

## Getting Started

To get started with our Code Review Automation system, follow these steps:
To get started with our work, follow these steps:

1. Clone this repository to your local machine:

@@ -50,12 +44,13 @@ To get started with our Code Review Automation system, follow these steps:
cd Code-Review-Automation-LM
```

2. Set up the required dependencies and environment (see `requirements.txt`).
2. Set up the required dependencies from `requirements.txt`, e.g. using `pip`:

3. Run the provided notebooks to explore data collection, model inference, and evaluation.
```bash
pip install -r requirements.txt
```

4. Integrate the code review automation system into your development workflow. You can use our pre-trained models or
fine-tune them on your specific codebase for even better results.
3. Run the provided notebooks to explore data collection, model inference, and evaluation.

## License

19 changes: 3 additions & 16 deletions _sources/docs/intro.md
@@ -3,18 +3,13 @@
## Introduction

In this series of Jupyter notebooks, we embark on a journey to collect code review data from GitHub repositories and
perform code review predictions using language models. Our primary goal is to explore the capabilities of different
models in generating code reviews and evaluate their performance.
perform code review predictions using language models. Our primary goal is to explore the capabilities of the [CodeReviewer](https://arxiv.org/abs/2203.09095) model in generating code reviews and evaluate its performance.

### Collecting Code Review Data

In this initial notebook, we dive into the process of collecting code review data from GitHub repositories. We leverage
the PyGithub library to interact with the GitHub API, ensuring seamless data retrieval.

We establish a function to collect code review data from a GitHub repository, allowing us to specify parameters such as
the number of comments to load, skipping author comments, and more. The collected data is structured into a Pandas
DataFrame for further analysis and processing.

Three prominent repositories, namely `microsoft/vscode`, `JetBrains/kotlin`, and `transloadit/uppy`, are selected for
data collection due to their popularity and rich code review history. Additionally, we are going to use data from the
original CodeReviewer dataset `msg-test` that is provided by the authors of {cite}`li2022codereviewer`.
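
The collection helper might look roughly like the following sketch; the function name, parameters, and DataFrame columns are illustrative stand-ins for the notebook's actual implementation.

```python
import pandas as pd
from github import Github


def collect_review_data(gh: Github, repo_name: str, max_comments: int = 500,
                        skip_author_comments: bool = True) -> pd.DataFrame:
    """Collect (diff hunk, review comment) pairs from a repository's pull requests."""
    repo = gh.get_repo(repo_name)
    rows = []
    for pr in repo.get_pulls(state="all"):
        for comment in pr.get_review_comments():
            # Optionally drop comments the PR author left on their own code.
            if skip_author_comments and comment.user.login == pr.user.login:
                continue
            rows.append({"repo": repo_name,
                         "diff_hunk": comment.diff_hunk,
                         "msg": comment.body})
            if len(rows) >= max_comments:
                return pd.DataFrame(rows)
    return pd.DataFrame(rows)
```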
@@ -24,9 +19,6 @@ original CodeReviewer dataset `msg-test` that is provided by the authors of {cit
The second notebook focuses on generating code reviews using the `microsoft/codereviewer` model. We delve into the
tokenization and dataset preparation process, emphasizing the importance of specialized tokens.

A custom `ReviewsDataset` class is introduced to facilitate data loading and transformation, making it compatible with
model inference. We load data from various sources, creating DataLoader instances for efficient model input.
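
In spirit, such a class wraps the collected (diff, review) pairs so a standard PyTorch `DataLoader` can batch them for inference; the sketch below is an assumed shape, not the notebook's exact `ReviewsDataset`.

```python
from torch.utils.data import DataLoader, Dataset


class ReviewsDataset(Dataset):
    """Tokenized diff hunks paired with their target review messages."""

    def __init__(self, df, tokenizer, max_length=512):
        self.df = df.reset_index(drop=True)
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        enc = self.tokenizer(row["diff_hunk"], truncation=True, padding="max_length",
                             max_length=self.max_length, return_tensors="pt")
        return {
            "input_ids": enc["input_ids"].squeeze(0),
            "attention_mask": enc["attention_mask"].squeeze(0),
            "msg": row["msg"],  # target text; tokenized later when needed
        }


# Assuming `df` and `tokenizer` come from the collection and model-loading steps:
# loader = DataLoader(ReviewsDataset(df, tokenizer), batch_size=8, shuffle=False)
```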

We explore the model inference process, employing both a HuggingFace pre-trained checkpoint and a fine-tuned
CodeReviewer model. The fine-tuning process details are outlined, showcasing parameters and resources used. Model
predictions are saved.
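
A fine-tuning setup in this style can be expressed with the `transformers` Seq2Seq trainer, roughly as below; the hyperparameters are placeholders, not the values actually reported in the notebook.

```python
from transformers import (AutoTokenizer, DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments, T5ForConditionalGeneration)

checkpoint = "microsoft/codereviewer"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

args = Seq2SeqTrainingArguments(
    output_dir="codereviewer-finetuned",
    per_device_train_batch_size=4,   # placeholder hyperparameters
    learning_rate=3e-4,
    num_train_epochs=1,
    predict_with_generate=True,
)

# `train_ds` and `eval_ds` are assumed to be tokenized datasets with
# input_ids / attention_mask / labels columns built from the collected data.
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    # train_dataset=train_ds,
    # eval_dataset=eval_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
# trainer.train()
# trainer.save_model("codereviewer-finetuned")
```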
@@ -37,15 +29,10 @@ In this notebook, we assess the quality of code review predictions generated by
fine-tuned models are evaluated across different datasets, shedding light on their performance.

Qualitative assessment is conducted to gain insights into how the models generate code reviews. We present samples of
code, along with predictions from both models, enabling a visual comparison with human reviews. This helps in
understanding the nuances of model-generated reviews.
code, along with predictions from both models, enabling a visual comparison with human reviews.

Lastly, we quantitatively evaluate the models' performance using BLEU-4 scores. We calculate scores for each dataset,
providing a comprehensive overview of how well the models align with human reviews. This quantitative analysis helps in
drawing conclusions about the effectiveness of the models in code review prediction.
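
Aggregated per-dataset scores could be produced along these lines (corpus-level BLEU-4 via NLTK; file names and column names are illustrative):

```python
import pandas as pd
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu

datasets = {
    "msg-test": "predictions_msg_test.csv",       # illustrative prediction dumps
    "microsoft/vscode": "predictions_vscode.csv",
}

scores = {}
for name, path in datasets.items():
    df = pd.read_csv(path)  # expected columns: "prediction" and "human_review"
    references = [[ref.split()] for ref in df["human_review"]]
    hypotheses = [hyp.split() for hyp in df["prediction"]]
    scores[name] = corpus_bleu(references, hypotheses,
                               weights=(0.25, 0.25, 0.25, 0.25),
                               smoothing_function=SmoothingFunction().method1)

print(pd.Series(scores, name="BLEU-4"))
```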

Throughout this journey, we aim to explore the capabilities and limitations of language models in the context of code
review, shedding light on their potential applications and areas for improvement.
providing a comprehensive overview of how well the models align with human reviews.

## Table of Contents


