
lighteval script failed #468

Open

foamliu opened this issue Mar 4, 2025 · 12 comments

@foamliu commented Mar 4, 2025

The following command fails when I try to execute it.

MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
MODEL_ARGS="pretrained=$MODEL,dtype=bfloat16,max_model_length=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}"
OUTPUT_DIR=data/evals/$MODEL

# AIME 2024
TASK=aime24
lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --output-dir $OUTPUT_DIR

Here is the error message.

[2025-03-04 17:43:42,101] [    INFO]: PyTorch version 2.5.1 available. (config.py:54)
INFO 03-04 17:43:49 __init__.py:190] Automatically detected platform cuda.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /data1/liuyang/code/lighteval-main/src/lighteval/main_vllm.py:105 in vllm                        │
│                                                                                                  │
│   102 │   from lighteval.logging.evaluation_tracker import EvaluationTracker                     │
│   103 │   from lighteval.models.model_input import GenerationParameters                          │
│   104 │   from lighteval.models.vllm.vllm_model import VLLMModelConfig                           │
│ ❱ 105 │   from lighteval.pipeline import EnvConfig, ParallelismManager, Pipeline, PipelinePara   │
│   106 │                                                                                          │
│   107 │   TOKEN = os.getenv("HF_TOKEN")                                                          │
│   108                                                                                            │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/pipeline.py:48 in <module>                      │
│                                                                                                  │
│    45 │   ModelResponse,                                                                         │
│    46 )                                                                                          │
│    47 from lighteval.tasks.lighteval_task import LightevalTask, create_requests_from_tasks       │
│ ❱  48 from lighteval.tasks.registry import Registry, taskinfo_selector                           │
│    49 from lighteval.tasks.requests import RequestType, SampleUid                                │
│    50 from lighteval.utils.imports import (                                                      │
│    51 │   NO_ACCELERATE_ERROR_MSG,                                                               │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/tasks/registry.py:36 in <module>                │
│                                                                                                  │
│    33 from datasets.load import dataset_module_factory                                           │
│    34                                                                                            │
│    35 import lighteval.tasks.default_tasks as default_tasks                                      │
│ ❱  36 from lighteval.tasks.extended import AVAILABLE_EXTENDED_TASKS_MODULES                      │
│    37 from lighteval.tasks.lighteval_task import LightevalTask, LightevalTaskConfig              │
│    38 from lighteval.utils.imports import CANNOT_USE_EXTENDED_TASKS_MSG, can_load_extended_tas   │
│    39                                                                                            │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/tasks/extended/__init__.py:29 in <module>       │
│                                                                                                  │
│   26 if can_load_extended_tasks():                                                               │
│   27 │   import lighteval.tasks.extended.hle.main as hle                                         │
│   28 │   import lighteval.tasks.extended.ifeval.main as ifeval                                   │
│ ❱ 29 │   import lighteval.tasks.extended.lcb.main as lcb                                         │
│   30 │   import lighteval.tasks.extended.mix_eval.main as mix_eval                               │
│   31 │   import lighteval.tasks.extended.mt_bench.main as mt_bench                               │
│   32 │   import lighteval.tasks.extended.olympiade_bench.main as olympiad_bench                  │
│                                                                                                  │
│ /data1/liuyang/code/lighteval-main/src/lighteval/tasks/extended/lcb/main.py:118 in <module>      │
│                                                                                                  │
│   115                                                                                            │
│   116 extend_enum(Metrics, "lcb_codegen_metric", lcb_codegen_metric)                             │
│   117                                                                                            │
│ ❱ 118 configs = get_dataset_config_names("livecodebench/code_generation_lite", trust_remote_co   │
│   119                                                                                            │
│   120 tasks = []                                                                                 │
│   121                                                                                            │
│                                                                                                  │
│ /data1/liuyang/xmodel-r1/openr1/lib/python3.11/site-packages/datasets/inspect.py:174 in          │
│ get_dataset_config_names                                                                         │
│                                                                                                  │
│   171 │   │   **download_kwargs,                                                                 │
│   172 │   )                                                                                      │
│   173 │   builder_cls = get_dataset_builder_class(dataset_module, dataset_name=os.path.basenam   │
│ ❱ 174 │   return list(builder_cls.builder_configs.keys()) or [                                   │
│   175 │   │   dataset_module.builder_kwargs.get("config_name", builder_cls.DEFAULT_CONFIG_NAME   │
│   176 │   ]                                                                                      │
│   177                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'NoneType' object has no attribute 'builder_configs'
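
For reference, the crash happens while importing the extended lcb task, inside the get_dataset_config_names call for livecodebench/code_generation_lite. A minimal sketch (mine, not from lighteval) to try reproducing it outside the CLI, assuming the same datasets install and Hub access:

# Minimal reproduction sketch (not part of lighteval): call the same datasets
# helper that lcb/main.py calls at import time.
from datasets import get_dataset_config_names

try:
    configs = get_dataset_config_names(
        "livecodebench/code_generation_lite", trust_remote_code=True
    )
    print(configs)
except AttributeError as err:
    # If the builder class resolves to None, this surfaces exactly as:
    # 'NoneType' object has no attribute 'builder_configs'
    print(f"Reproduced the failure: {err}")

If this snippet fails the same way, the problem sits in the datasets/Hub lookup rather than in the lighteval command itself.
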
@1058441072

Have you found a way to solve this?

@realzhukaihua

make evaluate MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B TASK=aime24

MODEL_ARGS="pretrained=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B,dtype=bfloat16,,max_model_length=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}" &&
if [ "aime24" = "lcb" ]; then
lighteval vllm $MODEL_ARGS "extended|lcb:codegeneration|0|0" \
    --use-chat-template \
    --output-dir data/evals/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B;
else
lighteval vllm $MODEL_ARGS "custom|aime24|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --output-dir data/evals/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B;
fi
[2025-03-04 15:39:51,073] [ INFO]: PyTorch version 2.5.1 available. (config.py:54)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/kaihua/open-r1/openr1/lib/python3.11/site-packages/lighteval/main_vllm.py:147 in vllm      │
│                                                                                                  │
│   144 │   │   metric_options = {}                                                                │
│   145 │   │                                                                                      │
│   146 │   model_args_dict: dict = {k.split("=")[0]: k.split("=")[1] if "=" in k else True for    │
│ ❱ 147 │   model_config = VLLMModelConfig(**model_args_dict, generation_parameters=generation_p   │
│   148 │   │                                                                                      │
│   149 │   pipeline = Pipeline(                                                                   │
│   150 │   │   tasks=tasks,                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: VLLMModelConfig.__init__() got an unexpected keyword argument ''

I got this error
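
My guess is that the stray double comma after dtype=bfloat16 in MODEL_ARGS is what produces the empty keyword argument. A sketch of the parsing step shown at main_vllm.py:146 in the traceback above (generation_parameters left out for brevity; this is not lighteval's exact code):

# Sketch only: show how a ",," in MODEL_ARGS becomes an empty key once the
# string is split on commas, mirroring the dict comprehension in the traceback.
model_args = "pretrained=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B,dtype=bfloat16,,max_model_length=32768"

model_args_dict = {
    k.split("=")[0]: k.split("=")[1] if "=" in k else True
    for k in model_args.split(",")
}
print(model_args_dict)
# {'pretrained': 'deepseek-ai/DeepSeek-R1-Distill-Qwen-32B', 'dtype': 'bfloat16',
#  '': True, 'max_model_length': '32768'}
# Passing this as VLLMModelConfig(**model_args_dict, ...) then raises
# "got an unexpected keyword argument ''".

Removing the extra comma should at least get past this particular TypeError.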

@foamliu (Author) commented Mar 5, 2025

Have you found a way to solve this?

I've been trying for the past two days, but I still haven't been able to resolve it.

@yaguanghu

same problem +1

@Liar-Mask

Same problem, and I hope it can be solved soon.

@Liar-Mask

The problem did not occur after I ran GIT_LFS_SKIP_SMUDGE=1 uv pip install -e ".[dev]".
It seems to be related to the version of lighteval or of other libraries.
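
If it helps, a quick way to compare versions before and after the reinstall (the package list is just my guess at the relevant ones):

# Sketch: print the installed versions of a few packages that seem relevant
# to this failure, using the current Python environment.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("lighteval", "datasets", "vllm", "transformers"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")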

@Cppowboy

Have you found any solution to this problem?

@buffliu commented Mar 10, 2025

Have you found any solution to this problem?

@shizhediao

Same problem...

@buffliu commented Mar 11, 2025

Has anybody solved this issue?

@buffliu commented Mar 11, 2025

Revert the lighteval code to commit 066f84f712c26be51a52979e08fa438d29ac4d35 and run pip install -e . from the lighteval checkout. My environment:

CUDA 12.4

tokenizers 0.21.0
torch 2.5.1
torchaudio 2.5.1
torchvision 0.20.1
tqdm 4.67.1
transformers 4.49.0
triton 3.1.0
trl 0.9.3
typeguard 4.4.2
typepy 1.3.4
typer 0.9.4
typing_extensions 4.12.2
tyro 0.9.16
tzdata 2025.1
urllib3 2.3.0
uvicorn 0.34.0
uvloop 0.21.0
virtualenv 20.29.3
vllm 0.7.3
wandb 0.19.8
wasabi 1.1.3
watchfiles 1.0.4
wcwidth 0.2.13
weasel 0.3.4
websockets 15.0.1
wrapt 1.17.2
xformers 0.0.28.post3

@Cppowboy commented Mar 12, 2025

I think this may be a problem with the datasets library or with the lcb dataset. I solved it by hacking the code at src/lighteval/tasks/extended/lcb/main.py:118:

# configs = get_dataset_config_names("livecodebench/code_generation_lite", trust_remote_code=True)
configs = ['release_v1', 'release_v2', 'release_v3', 'release_v4', 'release_v5', 'release_latest', 'v1', 'v2', 'v3', 'v4', 'v5', 'v1_v2', 'v1_v3', 'v1_v4', 'v1_v5', 'v2_v3', 'v2_v4', 'v2_v5', 'v3_v4', 'v3_v5', 'v4_v5']

It makes the evaluation script work, but I really do not think this is a good solution.
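
A slightly softer variant of the same hack (again just a sketch, not a real fix) is to keep the original lookup and only fall back to the hard-coded list when it fails:

# Sketch: try the normal datasets lookup first and fall back to the
# hard-coded config list from the comment above only if it raises.
from datasets import get_dataset_config_names

try:
    configs = get_dataset_config_names(
        "livecodebench/code_generation_lite", trust_remote_code=True
    )
except Exception:
    configs = [
        "release_v1", "release_v2", "release_v3", "release_v4", "release_v5",
        "release_latest", "v1", "v2", "v3", "v4", "v5",
        "v1_v2", "v1_v3", "v1_v4", "v1_v5", "v2_v3", "v2_v4", "v2_v5",
        "v3_v4", "v3_v5", "v4_v5",
    ]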
