If your model is a chat model, try `--hf-type chat`; this will use the model's chat_template. Separately, since OpenCompass uses HuggingFace under the hood to generate, try calling the original HF generate on one example to see whether it also takes that long.
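A minimal sketch of the suggested check, timing a single HF `generate` call. The model path is taken from the issue; the prompt, the helper name, and the loading code in `__main__` are illustrative placeholders, not part of OpenCompass:

```python
import time

def time_one_generate(model, tokenizer, prompt, max_new_tokens=32768):
    """Time a single HuggingFace generate() call to see if generation itself is slow."""
    # Tokenize the prompt and move it to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    elapsed = time.perf_counter() - start
    # Count only the newly generated tokens, not the prompt.
    new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
    print(f"{elapsed:.1f}s for {new_tokens} new tokens "
          f"({new_tokens / max(elapsed, 1e-9):.1f} tok/s)")
    return elapsed

if __name__ == "__main__":
    # Assumes transformers is installed and the local model path from the issue.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    path = "/home/maoshizhuo/2025/deepseek-Qwen-1.5B"
    tokenizer = AutoTokenizer.from_pretrained(path)
    model = AutoModelForCausalLM.from_pretrained(path, device_map="auto")
    time_one_generate(model, tokenizer, "What is 2 + 2?")
```

If one call already takes ~2 hours at 32k max output tokens, the bottleneck is plain HF generation rather than anything OpenCompass adds on top.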
Prerequisite
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
Inference runs correctly in this environment.
Reproduces the problem - code/configuration sample
python run.py --datasets math_500_gen --hf-type base --hf-path /home/maoshizhuo/2025/deepseek-Qwen-1.5B --debug --max-out-len 32768
02/25 23:53:14 - OpenCompass - INFO - Loading math_500_gen: /home/maoshizhuo/2025/opencompass/opencompass/configs/./datasets/math/math_500_gen.py
02/25 23:53:14 - OpenCompass - INFO - Loading example: /home/maoshizhuo/2025/opencompass/opencompass/configs/./summarizers/example.py
02/25 23:53:14 - OpenCompass - INFO - Current exp folder: outputs/default/20250225_235314
02/25 23:53:14 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
02/25 23:53:14 - OpenCompass - INFO - Partitioned into 1 tasks.
02/25 23:53:16 - OpenCompass - WARNING - Only use 1 GPUs for total 4 available GPUs in debug mode.
02/25 23:53:16 - OpenCompass - INFO - Task [deepseek-Qwen-1.5B_hf/math-500]
02/25 23:53:33 - OpenCompass - INFO - Try to load the data from /home/maoshizhuo/.cache/opencompass/./data/math/
02/25 23:53:33 - OpenCompass - INFO - Start inferencing [deepseek-Qwen-1.5B_hf/math-500]
11%|███████████████ | 7/63 [13:49:33<118:18:44, 7605.80s/it]
Reproduces the problem - command or script
python run.py --datasets math_500_gen --hf-type base --hf-path /home/maoshizhuo/2025/deepseek-Qwen-1.5B --debug --max-out-len 32768
Reproduces the problem - error message
Getting the result takes far too long: the estimated time to finish is about 131 hours.
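For reference, the ~131-hour figure is consistent with the tqdm line in the log above (7/63 items done at ~7605.8 s/it):

```python
# Sanity-check the total-runtime estimate from the tqdm progress line:
# 11%| 7/63 [13:49:33<118:18:44, 7605.80s/it]
seconds_per_item = 7605.80
total_items = 63
total_hours = seconds_per_item * total_items / 3600
print(f"{total_hours:.0f} hours")  # roughly 133 hours for the full run
```

That is about two hours per problem, which at 32768 max output tokens points to raw HF generation throughput as the bottleneck.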
Other information
Is there any way to speed up inference? I noticed that vLLM can accelerate inference, but since it integrates quantization techniques, the resulting accuracy is not exact; I want both accurate results and faster inference. My experiment environment has 4 V100-32G GPUs. Thanks!
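One possible direction, offered as an unverified sketch rather than a confirmed recipe: vLLM only quantizes when explicitly configured to, so an unquantized float16 vLLM backend should not lose precision relative to HF float16 inference. A hypothetical OpenCompass model config along these lines, assuming the `VLLM` wrapper in `opencompass.models` and that `model_kwargs` is passed through to the vLLM engine; field names and values here are illustrative and should be checked against your OpenCompass version:

```python
# Hypothetical config sketch: vLLM backend without quantization,
# tensor-parallel over the 4 V100-32G GPUs mentioned in the issue.
from opencompass.models import VLLM

models = [
    dict(
        type=VLLM,
        abbr='deepseek-qwen-1.5b-vllm',
        path='/home/maoshizhuo/2025/deepseek-Qwen-1.5B',
        # float16 (not quantized) keeps precision comparable to HF fp16;
        # V100 GPUs do not support bfloat16 well, so fp16 is the safe choice.
        model_kwargs=dict(tensor_parallel_size=4, dtype='float16'),
        max_out_len=32768,
        batch_size=32,
        run_cfg=dict(num_gpus=4),
    )
]
```

The speedup from vLLM comes from continuous batching and paged attention, not from quantization, so accuracy should match a float16 HF run up to normal floating-point nondeterminism.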