Separate inference 720p video with 24G VRAM #597

Open
wants to merge 5 commits into
base: main

Conversation

narrowsnap
@narrowsnap narrowsnap commented Jul 10, 2024

Add VAE encoder for reference.

Reduce inference VRAM by splitting inference into separate processes:

  1. Run text_encoder and save the text embedding.
  2. (Optional) Run the VAE encoder if the prompt contains reference_path.
  3. Run STDiT with the saved text embedding and save the latents.
  4. Run the VAE decoder with the saved latents.

README.md Outdated
Comment on lines 364 to 374
### Separate Inference 720p video with 24G VRAM
```bash
# text to video
./scripts/separate_inference.sh 4s 720p "9:16" "a beautiful waterfall"
```

```bash
# image to video
./scripts/separate_inference.sh 4s 720p "9:16" "a beautiful waterfall. {\"reference_path\": \"path2reference.png\",\"mask_strategy\": \"0\"}"
```
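
The nested, escaped quotes in the image-to-video command are easy to get wrong. A minimal sketch, assuming a hypothetical helper `build_i2v_prompt` (not part of this PR), that assembles the prompt with `printf` so no manual escaping is needed:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): append the reference JSON to a text prompt.
# Assumes the path and strategy contain no double quotes or backslashes.
build_i2v_prompt() {
    local text="$1" reference_path="$2" mask_strategy="${3:-0}"
    printf '%s {"reference_path": "%s","mask_strategy": "%s"}' \
        "$text" "$reference_path" "$mask_strategy"
}

prompt="$(build_i2v_prompt "a beautiful waterfall." "path2reference.png")"
echo "$prompt"
```

The result could then be passed as the fourth argument: `./scripts/separate_inference.sh 4s 720p "9:16" "$prompt"`.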

Contributor

I understand your motivation, but can you add more documentation explaining why, when, and how to run inference separately, so that other users feel more guided?

Author

@narrowsnap narrowsnap Jul 12, 2024

Done.


set_default_params "$@"

CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 --master_port=23456 scripts/separate_inference/inference_text_encoder.py configs/opensora-v1-2/inference/sample.py --aes 7 --num-frames "$num_frames" --resolution "$resolution" --aspect-ratio "$aspect_ratio" --prompt "$prompt"
Contributor

This uses 2 GPUs by default. Can you make the GPU count configurable via a bash argument as well?
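
One possible shape for this, as a sketch only: the helper `launch_prefix` and the use of a fifth positional argument for the GPU count are assumptions, not something the PR defines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the device list and process count for a given GPU count.
# launch_prefix 2 prints: CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2
launch_prefix() {
    local num_gpus="$1" devices="" i
    for ((i = 0; i < num_gpus; i++)); do
        # Prepend a comma only when the list is already non-empty.
        devices+="${devices:+,}$i"
    done
    printf 'CUDA_VISIBLE_DEVICES=%s torchrun --nproc_per_node %s' "$devices" "$num_gpus"
}

# Assumption: the GPU count is passed as a fifth positional argument, defaulting to 2.
num_gpus="${5:-2}"
# Echoed rather than executed, to show the resulting launch line.
echo "$(launch_prefix "$num_gpus") --master_port=23456 scripts/separate_inference/inference_text_encoder.py"
```

Defaulting to 2 keeps the current behaviour for callers that pass only the four existing arguments.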

@tpc2233
tpc2233 commented Aug 14, 2024

I tried your 24 GB VRAM code, @narrowsnap, but I'm hitting an issue in inference_stdit.py at the call with caption_embs=caption_embs, caption_emb_masks=caption_emb_masks, which fails with AttributeError: 'NoneType' object has no attribute 'encode' during inference. The rest of the steps seem OK.

@narrowsnap
Author

narrowsnap commented Aug 15, 2024

caption_emb_masks

Did you update the RFLOW code (opensora/schedulers/rf/__init__.py)?

@tpc2233
tpc2233 commented Aug 15, 2024

Yes, I even tried cloning your fork, but no luck. Is this the right file?
https://github.com/narrowsnap/Open-Sora/blob/main/opensora/schedulers/rf/__init__.py

@narrowsnap
Author

That is the wrong branch. You need to use the feature/720p_for_16g branch.

@tpc2233
tpc2233 commented Aug 15, 2024

Sorry, that is what I meant. Yes, I am using the file from that branch:
https://github.com/narrowsnap/Open-Sora/blob/feature/720p_for_16g/opensora/schedulers/rf/__init__.py

It fails with:
TypeError: sample() got an unexpected keyword argument 'caption_embs'
scripts/separate_inference/inference_stdit.py FAILED
which then results in:
FileNotFoundError: [Errno 2] No such file or directory: './samples/samples/2024-08-15/00002/0_0_latents.pt'

Also, many thanks for the quick replies, much appreciated.
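
The FileNotFoundError here is a downstream symptom: the STDiT stage failed, so the latents file the next stage expects was never written. A minimal sketch of stopping at the first failed stage, assuming a hypothetical `run_stage` helper that is not part of the PR:

```bash
#!/usr/bin/env bash
# Hypothetical helper: run one stage and confirm it wrote the file the next stage needs.
# Usage: run_stage <name> <expected_output_file> <command> [args...]
run_stage() {
    local name="$1" expected="$2"
    shift 2
    if ! "$@"; then
        echo "stage '$name' failed; skipping later stages" >&2
        return 1
    fi
    if [ ! -f "$expected" ]; then
        echo "stage '$name' exited cleanly but did not write $expected" >&2
        return 1
    fi
}

# Stand-in commands for illustration; a real script would pass its torchrun lines.
workdir="$(mktemp -d)"
run_stage demo_ok "$workdir/latents.pt" touch "$workdir/latents.pt" && echo "demo_ok passed"
run_stage demo_fail "$workdir/missing.pt" false || echo "demo_fail stopped the pipeline"
```

With a guard like this, only the original TypeError would be reported, without the follow-on missing-file error.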

@narrowsnap
Author

What is the command you used?

@tpc2233
tpc2233 commented Aug 15, 2024

From the root of the fork, I'm just running: bash ./scripts/separate_inference.sh

@narrowsnap
Author

I can run it successfully. Based on the error you showed, I suggest you check whether caption_embs is present in your code (opensora/schedulers/rf/__init__.py, line 45).

@tpc2233
tpc2233 commented Aug 15, 2024

Got it working :) The solution was to delete all existing installations and the conda env, then install only from your fork. After you said it was working for you, I tried deleting rf/__init__.py and still got the same error, so I think some kind of caching was still referencing the original installation. After reinstalling everything, it worked. Many thanks for the quick replies and help, @narrowsnap. Great work.
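
For anyone hitting the same symptom, a quick way to see which copy of the package Python actually resolves is sketched below. The helper name `module_origin` is hypothetical, and the sketch assumes `python3` points at the environment used for inference.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the file a top-level Python package resolves to,
# or "not found" if it is not importable from the current environment.
module_origin() {
    python3 - "$1" <<'EOF'
import importlib.util
import sys

spec = importlib.util.find_spec(sys.argv[1])
print(spec.origin if spec and spec.origin else "not found")
EOF
}

# If this prints a site-packages path instead of the fork checkout,
# an older installation is shadowing the fork.
module_origin opensora
```

Running `grep -n caption_embs "$(dirname "$(module_origin opensora)")/schedulers/rf/__init__.py"` would then show whether the resolved copy accepts `caption_embs`.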

@Luke100000

Hi, is it possible to further squash VRAM usage to get it running on 12GB? :)
Right now, the T5 encoder has the highest spike. Running it on CPU (only the text encoder) allows me to generate a 720p 3s video. And smaller gens would fit into 8GB just fine as well.

4 participants