I attempted evaluations using the llava-hf and llama-vision models but encountered out-of-memory (OOM) errors.
For the llava-hf model, the OOM error occurred with the VQAv2 dataset but not with the MMMU dataset.
For the llama-vision model, OOM errors occurred with every dataset, even when using nine 40GB GPUs. Setting device_map=auto did not resolve the issue.
Is there a way to run the evaluations without encountering OOM errors?
Hiii! You may try starting the task directly with lmms-eval, because if you haven't run accelerate config, accelerate will by default load one full copy of the model on each GPU:
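For instance, here is a minimal sketch reusing the llava-hf arguments from the scripts below (the direct module invocation is an assumption about how the package is usually run, not a confirmed fix):

# Run lmms-eval directly, without the accelerate launcher,
# so that only a single copy of the model is loaded.
python3 -m lmms_eval \
    --model llava_hf \
    --model_args pretrained="llava-hf/llava-v1.6-vicuna-7b-hf" \
    --tasks vqav2 \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_v1.5_vqav2 \
    --output_path ./logs/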
Below are the shell scripts I used, for reference:
python3 -m accelerate.commands.launch \
    --num_processes=9 \
    -m lmms_eval \
    --model llava_hf \
    --model_args pretrained="llava-hf/llava-v1.6-vicuna-7b-hf" \
    --tasks vqav2 \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_v1.5_vqav2 \
    --output_path ./logs/
python3 -m accelerate.commands.launch \
    --num_processes=9 \
    -m lmms_eval \
    --model llama_vision \
    --model_args pretrained="meta-llama/Llama-3.2-11B-Vision-Instruct",device_map=auto \
    --tasks mmmu \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llama_11b_mmmu \
    --output_path ./logs/
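Following the suggestion above, a single-process variant of the llama-vision command would look like the sketch below; with only one process, device_map=auto should shard the 11B model across the available GPUs instead of each process loading its own copy (this variant is an assumption, not a confirmed fix):

# Launch a single process so data parallelism does not multiply memory use;
# device_map=auto then spreads the model weights across all visible GPUs.
python3 -m accelerate.commands.launch \
    --num_processes=1 \
    -m lmms_eval \
    --model llama_vision \
    --model_args pretrained="meta-llama/Llama-3.2-11B-Vision-Instruct",device_map=auto \
    --tasks mmmu \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llama_11b_mmmu \
    --output_path ./logs/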