Update leaderboard-livecodebench.md (huggingface#2003)
Space cannot be embedded
clefourrier authored Apr 17, 2024
1 parent d0ae291 commit 223cdbe
Showing 1 changed file with 2 additions and 6 deletions.
8 changes: 2 additions & 6 deletions leaderboard-livecodebench.md
@@ -19,11 +19,7 @@ authors:

# Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

-We are excited to introduce the LiveCodeBench leaderboard, based on LiveCodeBench, a new benchmark developed by researchers from UC Berkeley, MIT, and Cornell for measuring LLMs’ code generation capabilities.
-
-<script type="module" src="https://gradio.s3-us-west-2.amazonaws.com/3.45.1/gradio.js"> </script>
-<gradio-app theme_mode="light" space="livecodebench/leaderboard"></gradio-app>
-
+We are excited to introduce the [LiveCodeBench leaderboard](https://huggingface.co/spaces/livecodebench/leaderboard), based on LiveCodeBench, a new benchmark developed by researchers from UC Berkeley, MIT, and Cornell for measuring LLMs’ code generation capabilities.

LiveCodeBench collects coding problems over time from various coding contest platforms, annotating problems with their release dates. Annotations are used to evaluate models on problem sets released in different time windows, allowing an “evaluation over time” strategy that helps detect and prevent contamination. In addition to the usual code generation task, LiveCodeBench also assesses self-repair, test output prediction, and code execution, thus providing a more holistic view of coding capabilities required for the next generation of AI programming agents.

@@ -87,4 +83,4 @@ for different scenarios. For new model families, we have implemented an extensib


## How to contribute
-Finally, we are looking for collaborators and suggestions for LiveCodeBench. The [dataset](https://huggingface.co/livecodebench) and [code](https://github.com/LiveCodeBench/LiveCodeBench) are available online, so please reach out by submitting an issue or [mail](mailto:[email protected]).
+Finally, we are looking for collaborators and suggestions for LiveCodeBench. The [dataset](https://huggingface.co/livecodebench) and [code](https://github.com/LiveCodeBench/LiveCodeBench) are available online, so please reach out by submitting an issue or [mail](mailto:[email protected]).
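
For readers who want a concrete picture of the “evaluation over time” idea described in the post, here is a minimal, hypothetical sketch of selecting only problems released after a model’s training cutoff. This is not the LiveCodeBench implementation; the record fields, dates, and cutoff below are illustrative assumptions.

```python
from datetime import date

# Illustrative problem records; LiveCodeBench annotates each problem with the
# date it was released on the contest platform (field names here are assumptions).
problems = [
    {"id": "lc-0001", "platform": "leetcode",   "release_date": date(2023, 6, 12)},
    {"id": "cf-0117", "platform": "codeforces", "release_date": date(2024, 2, 3)},
    {"id": "ac-0042", "platform": "atcoder",    "release_date": date(2023, 11, 20)},
]

def problems_in_window(records, start, end):
    """Return problems released in [start, end): scoring a model only on
    problems that appeared after its training cutoff limits contamination."""
    return [p for p in records if start <= p["release_date"] < end]

# Hypothetical cutoff: evaluate only on problems released after September 2023.
fresh = problems_in_window(problems, date(2023, 9, 1), date(2024, 4, 1))
print([p["id"] for p in fresh])  # -> ['cf-0117', 'ac-0042']
```

Comparing a model’s scores across such release-date windows is what lets the benchmark flag likely contamination.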
