-
Why does PyTorch exist if TensorFlow already exists? And why are new models released if Llama already exists?
-
The reason is simple: if an existing toolkit cannot meet our needs, we develop our own tool to support our research and projects. Some specific points: First, OpenCompass originated from our internal R&D needs. We surveyed the implementations available in the community last year and found that the open-source solutions could not meet those needs, so we developed our own toolkit. We release it to benefit the community, so that every researcher can choose the toolkit that fits their own needs. Open-sourcing is a result rather than the reason. What we have released is not only software like OpenCompass, but also evaluation benchmarks and methods, such as MMBench, MathBench, CIBench, Prism, and others.
Secondly, the comparison with cryptographic algorithms does not fit LLM evaluation. The basic idea of evaluation is simple: prompt an LLM to generate a response, then check the answer against a reference. Anyone can implement this quickly. Many implementation details in LM-Harness (the prompts, for example) are suboptimal; for instance, we could not reproduce the reported performance of Llama with LM-Harness. In the AI community, everyone has the right to implement their own algorithm or framework.
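To make the "prompt, generate, compare" idea concrete, here is a minimal sketch in Python. It is not OpenCompass or LM-Harness code: `evaluate`, `normalize`, `toy_model`, and the prompt template are all hypothetical names made up for illustration, and the "model" is a stub standing in for a real LLM call.

```python
from typing import Callable, Iterable


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting
    # differences do not count as wrong answers.
    return " ".join(text.strip().lower().split())


def evaluate(
    generate: Callable[[str], str],
    dataset: Iterable[tuple[str, str]],
    template: str = "Question: {question}\nAnswer:",
) -> float:
    # Exact-match accuracy: build a prompt, get a response,
    # compare it with the reference answer.
    total = 0
    correct = 0
    for question, reference in dataset:
        prompt = template.format(question=question)
        prediction = generate(prompt)
        correct += normalize(prediction) == normalize(reference)
        total += 1
    return correct / total if total else 0.0


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM: a lookup table, no real model.
    answers = {"2 + 2": "4", "capital of France": " Paris "}
    for key, value in answers.items():
        if key in prompt:
            return value
    return "unknown"


dataset = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "paris"),
    ("Who wrote Hamlet?", "Shakespeare"),
]

accuracy = evaluate(toy_model, dataset)
print(f"accuracy = {accuracy:.3f}")
```

The `template` argument is where prompt choices enter: two frameworks running the same loop with different templates or different answer normalization can report different scores for the same model.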
-
Why does OpenCompass exist if lm-harness already exists and is more widely used? Ref: https://github.com/EleutherAI/lm-evaluation-harness