Colab L4 24GB VRAM - OOM #539

sdes21 · 2024-11-22T01:06:23Z

System Info

Google Colab L4 - 24gb VRAM

Information

The official example scripts
My own modified scripts

Reproduction

Updated to use the latest diffusers.

Using float16.

transformer = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="transformer", torch_dtype=torch.float16)
text_encoder = T5EncoderModel.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="text_encoder", torch_dtype=torch.float16)
vae = AutoencoderKLCogVideoX.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="vae", torch_dtype=torch.float16)

OutOfMemoryError Traceback (most recent call last)
in <cell line: 1>()
----> 1 video = pipe(
2 prompt=prompt,
3 image=image,
4 num_videos_per_prompt=1,
5 num_inference_steps=40,

17 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
718 self.groups,
719 )
--> 720 return F.conv3d(
721 input, weight, bias, self.stride, self.padding, self.dilation, self.groups
722 )

OutOfMemoryError: CUDA out of memory. Tried to allocate 1.38 GiB. GPU 0 has a total capacity of 22.17 GiB of which 842.88 MiB is free. Process 24787 has 21.34 GiB memory in use. Of the allocated memory 21.08 GiB is allocated by PyTorch, and 29.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Expected behavior

I followed the steps but always run out of memory (OOM).

I had pipe.enable_sequential_cpu_offload() enabled.
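
For anyone reading the error message above, here is a rough sketch of the arithmetic behind it. The helper name `oom_shortfall_mib` is made up for illustration and is not part of PyTorch. It only parses the text of the message and compares the requested allocation with the reported free memory.

```python
import re

# Conversion factors to MiB for the two units that appear in the message.
UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def _to_mib(value, unit):
    return float(value) * UNIT_TO_MIB[unit]


def oom_shortfall_mib(message):
    """Return (requested, free, shortfall) in MiB parsed from a CUDA OOM message.

    Hypothetical helper for illustration only; it is not a PyTorch API.
    """
    requested = re.search(r"Tried to allocate ([\d.]+) (MiB|GiB)", message)
    free = re.search(r"of which ([\d.]+) (MiB|GiB) is free", message)
    if requested is None or free is None:
        raise ValueError("message does not look like a PyTorch CUDA OOM error")
    req_mib = _to_mib(*requested.groups())
    free_mib = _to_mib(*free.groups())
    return req_mib, free_mib, req_mib - free_mib


# The relevant part of the error message quoted in this post.
msg = (
    "CUDA out of memory. Tried to allocate 1.38 GiB. GPU 0 has a total capacity "
    "of 22.17 GiB of which 842.88 MiB is free."
)
requested, free, shortfall = oom_shortfall_mib(msg)
print(f"requested={requested:.2f} MiB free={free:.2f} MiB shortfall={shortfall:.2f} MiB")
```

The message also reports only 29.37 MiB as reserved but unallocated, so fragmentation does not look like the main problem here, and the expandable_segments hint would probably not close a gap of this size on its own.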

sdes21 · 2024-11-22T01:29:55Z

This is my Colab code:

!pip install git+https://github.com/huggingface/diffusers
!pip install --upgrade transformers hf_transfer accelerate diffusers imageio-ffmpeg
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

import torch
from diffusers import AutoencoderKLCogVideoX, CogVideoXImageToVideoPipeline, CogVideoXTransformer3DModel
from diffusers.utils import export_to_video, load_image
from transformers import T5EncoderModel

transformer = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="transformer", torch_dtype=torch.float16)
text_encoder = T5EncoderModel.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="text_encoder", torch_dtype=torch.float16)
vae = AutoencoderKLCogVideoX.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="vae", torch_dtype=torch.float16)

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX1.5-5B-I2V",
    text_encoder=text_encoder,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.float16,
)

pipe.enable_model_cpu_offload()

prompt = "A cat sitting on a couch playing guitar. High quality, ultrarealistic detail and breath-taking movie-like camera shot."
image = load_image("/content/a.png")

video = pipe(
    prompt=prompt,
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=40,
    num_frames=81,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

zRzRzRzRzRzRzR · 2024-11-22T03:03:46Z

Try the loading method from our cli_demo; otherwise peak VRAM usage will exceed 24 GB.
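
A minimal sketch of what a lower-memory loading path could look like with diffusers. This is an assumption about the kind of setup the cli_demo uses and is not confirmed in this thread. The helper names `apply_low_vram_settings` and `load_low_vram_pipeline` are hypothetical. The diffusers calls themselves (`enable_sequential_cpu_offload`, `vae.enable_slicing`, `vae.enable_tiling`) are existing pipeline and VAE methods. Note that the posted Colab code calls `pipe.enable_model_cpu_offload()`, while the issue text mentions `pipe.enable_sequential_cpu_offload()`; the sketch uses the sequential variant. The traceback ends in `F.conv3d`, which may point to the VAE, so slicing and tiling are enabled as well.

```python
def apply_low_vram_settings(pipe):
    """Enable diffusers memory savers on an already-loaded pipeline.

    Hypothetical helper name; not part of diffusers or the CogVideo repository.
    """
    # Move submodules to the GPU only while they run. This is slower than
    # enable_model_cpu_offload() but keeps less resident on the GPU.
    # Do not also call pipe.to("cuda") when this is enabled.
    pipe.enable_sequential_cpu_offload()
    # Process the VAE batch one sample at a time.
    pipe.vae.enable_slicing()
    # Process each frame in spatial tiles to shrink the VAE's activations.
    pipe.vae.enable_tiling()
    return pipe


def load_low_vram_pipeline(model_id="THUDM/CogVideoX1.5-5B-I2V", torch_dtype=None):
    # Imports are deferred so apply_low_vram_settings can be used on its own.
    import torch
    from diffusers import CogVideoXImageToVideoPipeline

    if torch_dtype is None:
        # float16 matches the code posted in this thread; whether another
        # dtype suits this checkpoint better is not settled here.
        torch_dtype = torch.float16
    pipe = CogVideoXImageToVideoPipeline.from_pretrained(model_id, torch_dtype=torch_dtype)
    return apply_low_vram_settings(pipe)
```

In the Colab code above, this would amount to replacing the `pipe.enable_model_cpu_offload()` line with `apply_low_vram_settings(pipe)` and leaving the rest of the call unchanged. Whether that is enough to fit 81 frames on an L4 is not something this thread establishes.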

zRzRzRzRzRzRzR self-assigned this Nov 22, 2024
