From 223cdbefbf540cf3e99cb54ab2f9460ea99e0ebf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Fourrier?= <22726840+clefourrier@users.noreply.github.com>
Date: Wed, 17 Apr 2024 11:07:20 +0200
Subject: [PATCH] Update leaderboard-livecodebench.md (#2003)

Space cannot be embedded
---
 leaderboard-livecodebench.md | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/leaderboard-livecodebench.md b/leaderboard-livecodebench.md
index 5ab7347dea..ea4ebd2216 100644
--- a/leaderboard-livecodebench.md
+++ b/leaderboard-livecodebench.md
@@ -19,11 +19,7 @@ authors:
 
 # Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs
 
-We are excited to introduce the LiveCodeBench leaderboard, based on LiveCodeBench, a new benchmark developed by researchers from UC Berkeley, MIT, and Cornell for measuring LLMs’ code generation capabilities.
-
-
-
-
+We are excited to introduce the [LiveCodeBench leaderboard](https://huggingface.co/spaces/livecodebench/leaderboard), based on LiveCodeBench, a new benchmark developed by researchers from UC Berkeley, MIT, and Cornell for measuring LLMs’ code generation capabilities.
 
 LiveCodeBench collects coding problems over time from various coding contest platforms, annotating problems with their release dates. Annotations are used to evaluate models on problem sets released in different time windows, allowing an “evaluation over time” strategy that helps detect and prevent contamination. In addition to the usual code generation task, LiveCodeBench also assesses self-repair, test output prediction, and code execution, thus providing a more holistic view of coding capabilities required for the next generation of AI programming agents.
 
@@ -87,4 +83,4 @@ for different scenarios. For new model families, we have implemented an extensib
 
 ## How to contribute
 
-Finally, we are looking for collaborators and suggestions for LiveCodeBench. The [dataset](https://huggingface.co/livecodebench) and [code](https://github.com/LiveCodeBench/LiveCodeBench) are available online, so please reach out by submitting an issue or [mail](mailto:naman_jain@berkeley.edu).
\ No newline at end of file
+Finally, we are looking for collaborators and suggestions for LiveCodeBench. The [dataset](https://huggingface.co/livecodebench) and [code](https://github.com/LiveCodeBench/LiveCodeBench) are available online, so please reach out by submitting an issue or [mail](mailto:naman_jain@berkeley.edu).