
lighteval script failed #468

Open

foamliu opened this issue Mar 4, 2025 · 12 comments

@foamliu commented Mar 4, 2025

The following command fails when I try to execute it.

MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
MODEL_ARGS="pretrained=$MODEL,dtype=bfloat16,max_model_length=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}"
OUTPUT_DIR=data/evals/$MODEL

# AIME 2024
TASK=aime24
lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --output-dir $OUTPUT_DIR

Here is the error message.

[2025-03-04 17:43:42,101] [    INFO]: PyTorch version 2.5.1 available. (config.py:54)
INFO 03-04 17:43:49 __init__.py:190] Automatically detected platform cuda.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /data1/liuyang/code/lighteval-main/src/lighteval/main_vllm.py:105 in vllm                        │
│                                                                                                  │
│   102 │   from lighteval.logging.evaluation_tracker import EvaluationTracker                     │
│   103 │   from lighteval.models.model_input import GenerationParameters                          │
│   104 │   from lighteval.models.vllm.vllm_model import VLLMModelConfig                           │
│ ❱ 105 │   from lighteval.pipeline import EnvConfig, ParallelismManager, Pipeline, PipelinePara   │
│   106 │                                                                                          │
│   107 │   TOKEN = os.getenv("HF_TOKEN")                                                          │
│   108                                                                                            │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/pipeline.py:48 in <module>                      │
│                                                                                                  │
│    45 │   ModelResponse,                                                                         │
│    46 )                                                                                          │
│    47 from lighteval.tasks.lighteval_task import LightevalTask, create_requests_from_tasks       │
│ ❱  48 from lighteval.tasks.registry import Registry, taskinfo_selector                           │
│    49 from lighteval.tasks.requests import RequestType, SampleUid                                │
│    50 from lighteval.utils.imports import (                                                      │
│    51 │   NO_ACCELERATE_ERROR_MSG,                                                               │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/tasks/registry.py:36 in <module>                │
│                                                                                                  │
│    33 from datasets.load import dataset_module_factory                                           │
│    34                                                                                            │
│    35 import lighteval.tasks.default_tasks as default_tasks                                      │
│ ❱  36 from lighteval.tasks.extended import AVAILABLE_EXTENDED_TASKS_MODULES                      │
│    37 from lighteval.tasks.lighteval_task import LightevalTask, LightevalTaskConfig              │
│    38 from lighteval.utils.imports import CANNOT_USE_EXTENDED_TASKS_MSG, can_load_extended_tas   │
│    39                                                                                            │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/tasks/extended/__init__.py:29 in <module>       │
│                                                                                                  │
│   26 if can_load_extended_tasks():                                                               │
│   27 │   import lighteval.tasks.extended.hle.main as hle                                         │
│   28 │   import lighteval.tasks.extended.ifeval.main as ifeval                                   │
│ ❱ 29 │   import lighteval.tasks.extended.lcb.main as lcb                                         │
│   30 │   import lighteval.tasks.extended.mix_eval.main as mix_eval                               │
│   31 │   import lighteval.tasks.extended.mt_bench.main as mt_bench                               │
│   32 │   import lighteval.tasks.extended.olympiade_bench.main as olympiad_bench                  │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/tasks/extended/lcb/main.py:118 in <module>      │
│                                                                                                  │
│   115                                                                                            │
│   116 extend_enum(Metrics, "lcb_codegen_metric", lcb_codegen_metric)                             │
│   117                                                                                            │
│ ❱ 118 configs = get_dataset_config_names("livecodebench/code_generation_lite", trust_remote_co   │
│   119                                                                                            │
│   120 tasks = []                                                                                 │
│   121                                                                                            │
│                                                                                                  │
│ /data1/liuyang/xmodel-r1/openr1/lib/python3.11/site-packages/datasets/inspect.py:174 in          │
│ get_dataset_config_names                                                                         │
│                                                                                                  │
│   171 │   │   **download_kwargs,                                                                 │
│   172 │   )                                                                                      │
│   173 │   builder_cls = get_dataset_builder_class(dataset_module, dataset_name=os.path.basenam   │
│ ❱ 174 │   return list(builder_cls.builder_configs.keys()) or [                                   │
│   175 │   │   dataset_module.builder_kwargs.get("config_name", builder_cls.DEFAULT_CONFIG_NAME   │
│   176 │   ]                                                                                      │
│   177                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'NoneType' object has no attribute 'builder_configs'
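
For reference, the crash happens while importing the extended lcb task, inside the get_dataset_config_names call for livecodebench/code_generation_lite. A minimal sketch (mine, not from lighteval) to try reproducing it outside the CLI, assuming the same datasets install and Hub access:

# Minimal reproduction sketch (not part of lighteval): call the same datasets
# helper that lcb/main.py calls at import time.
from datasets import get_dataset_config_names

try:
    configs = get_dataset_config_names(
        "livecodebench/code_generation_lite", trust_remote_code=True
    )
    print(configs)
except AttributeError as err:
    # If the builder class resolves to None, this surfaces exactly as:
    # 'NoneType' object has no attribute 'builder_configs'
    print(f"Reproduced the failure: {err}")

If this snippet fails the same way, the problem sits in the datasets/Hub lookup rather than in the lighteval command itself.
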
@1058441072

Have you found a way to solve this?

@realzhukaihua

make evaluate MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B TASK=aime24

MODEL_ARGS="pretrained=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B,dtype=bfloat16,,max_model_length=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}" &&
if [ "aime24" = "lcb" ]; then
lighteval vllm $MODEL_ARGS "extended|lcb:codegeneration|0|0" \
    --use-chat-template \
    --output-dir data/evals/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B;
else
lighteval vllm $MODEL_ARGS "custom|aime24|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --output-dir data/evals/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B;
fi
[2025-03-04 15:39:51,073] [ INFO]: PyTorch version 2.5.1 available. (config.py:54)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/kaihua/open-r1/openr1/lib/python3.11/site-packages/lighteval/main_vllm.py:147 in vllm      │
│                                                                                                  │
│   144 │   │   metric_options = {}                                                                │
│   145 │   │                                                                                      │
│   146 │   model_args_dict: dict = {k.split("=")[0]: k.split("=")[1] if "=" in k else True for    │
│ ❱ 147 │   model_config = VLLMModelConfig(**model_args_dict, generation_parameters=generation_p   │
│   148 │   │                                                                                      │
│   149 │   pipeline = Pipeline(                                                                   │
│   150 │   │   tasks=tasks,                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: VLLMModelConfig.__init__() got an unexpected keyword argument ''

I got this error
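
My guess is that the stray double comma after dtype=bfloat16 in MODEL_ARGS is what produces the empty keyword argument. A sketch of the parsing step shown at main_vllm.py:146 in the traceback above (generation_parameters left out for brevity; this is not lighteval's exact code):

# Sketch only: show how a ",," in MODEL_ARGS becomes an empty key once the
# string is split on commas, mirroring the dict comprehension in the traceback.
model_args = "pretrained=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B,dtype=bfloat16,,max_model_length=32768"

model_args_dict = {
    k.split("=")[0]: k.split("=")[1] if "=" in k else True
    for k in model_args.split(",")
}
print(model_args_dict)
# {'pretrained': 'deepseek-ai/DeepSeek-R1-Distill-Qwen-32B', 'dtype': 'bfloat16',
#  '': True, 'max_model_length': '32768'}
# Passing this as VLLMModelConfig(**model_args_dict, ...) then raises
# "got an unexpected keyword argument ''".

Removing the extra comma should at least get past this particular TypeError.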

@foamliu (Author) commented Mar 5, 2025

Have you found a way to solve this?

I've been trying for the past two days, but I still haven't been able to resolve it.

@yaguanghu

same problem +1

@Liar-Mask

Same problem, and I hope it can be solved soon.

@Liar-Mask

The problem did not occur after I ran GIT_LFS_SKIP_SMUDGE=1 uv pip install -e ".[dev]".
It seems to be related to the version of lighteval or of other libraries.
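
If it helps, a quick way to compare versions before and after the reinstall (the package list is just my guess at the relevant ones):

# Sketch: print the installed versions of a few packages that seem relevant
# to this failure, using the current Python environment.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("lighteval", "datasets", "vllm", "transformers"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")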

@Cppowboy

Have you found any solution to this problem?

@buffliu commented Mar 10, 2025

Have you found any solution to this problem?

@shizhediao

Same problem...

@buffliu commented Mar 11, 2025

Has anybody solved this issue?

@buffliu commented Mar 11, 2025

Revert the lighteval code to commit 066f84f712c26be51a52979e08fa438d29ac4d35 and run pip install -e . from the lighteval checkout. My environment:

CUDA 12.4

tokenizers 0.21.0
torch 2.5.1
torchaudio 2.5.1
torchvision 0.20.1
tqdm 4.67.1
transformers 4.49.0
triton 3.1.0
trl 0.9.3
typeguard 4.4.2
typepy 1.3.4
typer 0.9.4
typing_extensions 4.12.2
tyro 0.9.16
tzdata 2025.1
urllib3 2.3.0
uvicorn 0.34.0
uvloop 0.21.0
virtualenv 20.29.3
vllm 0.7.3
wandb 0.19.8
wasabi 1.1.3
watchfiles 1.0.4
wcwidth 0.2.13
weasel 0.3.4
websockets 15.0.1
wrapt 1.17.2
xformers 0.0.28.post3

@Cppowboy commented Mar 12, 2025

I think this may be a problem with the datasets library or with the lcb dataset. I solved it by hacking the code at src/lighteval/tasks/extended/lcb/main.py:118:

# configs = get_dataset_config_names("livecodebench/code_generation_lite", trust_remote_code=True)
configs = ['release_v1', 'release_v2', 'release_v3', 'release_v4', 'release_v5', 'release_latest', 'v1', 'v2', 'v3', 'v4', 'v5', 'v1_v2', 'v1_v3', 'v1_v4', 'v1_v5', 'v2_v3', 'v2_v4', 'v2_v5', 'v3_v4', 'v3_v5', 'v4_v5']

It makes the evaluation script work, but I really do not think this is a good solution.
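
A slightly softer variant of the same hack (again just a sketch, not a real fix) is to keep the original lookup and only fall back to the hard-coded list when it fails:

# Sketch: try the normal datasets lookup first and fall back to the
# hard-coded config list from the comment above only if it raises.
from datasets import get_dataset_config_names

try:
    configs = get_dataset_config_names(
        "livecodebench/code_generation_lite", trust_remote_code=True
    )
except Exception:
    configs = [
        "release_v1", "release_v2", "release_v3", "release_v4", "release_v5",
        "release_latest", "v1", "v2", "v3", "v4", "v5",
        "v1_v2", "v1_v3", "v1_v4", "v1_v5", "v2_v3", "v2_v4", "v2_v5",
        "v3_v4", "v3_v5", "v4_v5",
    ]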
