
Error when Importing Cosmos-1.0 VAE Model Using AutoencoderKLVAE Due to .jit File Format #53

Open
lith0613 opened this issue Jan 15, 2025 · 6 comments


lith0613 commented Jan 15, 2025

When executing

    python cosmos1/models/diffusion/nemo/post_training/prepare_dataset.py

there is an issue with the import `from nemo.collections.diffusion.models.model import DiT7BConfig`. Specifically, the failure occurs during the initialization of `DiT7BConfig`:

    dit_config = DiT7BConfig(vae_path=tokenizer_dir)
    vae = dit_config.configure_vae()

The problem traces to the VAE initialization, which executes

    def configure_vae(self):
        """Dynamically import video tokenizer."""
        return dynamic_import(self.vae_module)(self.vae_path)

at line 183 of https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/diffusion/models/model.py.
Here `dynamic_import(self.vae_module)` resolves to `nemo.collections.diffusion.vae.diffusers_vae.AutoencoderKLVAE`, and `self.vae_path` points to `Cosmos-1.0-Tokenizer-CV8x8x8`.
However, the open-source VAE model Cosmos-1.0-Tokenizer-CV8x8x8, https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-CV8x8x8/tree/main, consists only of files ending in `.jit`, which cannot be loaded by `self.vae = AutoencoderKL.from_pretrained(path, torch_dtype=torch.bfloat16)` at line 36 of https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/diffusion/vae/diffusers_vae.py. How should this issue be handled?
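For context, `.jit` files are TorchScript archives, which is why `AutoencoderKL.from_pretrained` (expecting a diffusers-style `config.json` plus weight files) rejects them. A minimal sketch, not the Cosmos loader itself, showing that this file format is read with `torch.jit.load` (the `TinyAE` module is a stand-in for illustration):

```python
import torch

# Sketch only: .jit checkpoints are TorchScript archives, the format
# torch.jit.load reads. diffusers' AutoencoderKL.from_pretrained instead
# expects a directory with config.json and weight files, which is why it
# fails on the Cosmos-1.0-Tokenizer-CV8x8x8 checkpoint.

class TinyAE(torch.nn.Module):
    """Stand-in module; the real tokenizer is a scripted video VAE."""
    def forward(self, x):
        return x * 0.5

# Saving a scripted module produces the same on-disk format as the
# .jit files on the Hugging Face Hub.
torch.jit.script(TinyAE()).save("tiny_ae.jit")

# A .jit archive loads without the original Python class definition.
model = torch.jit.load("tiny_ae.jit", map_location="cpu").eval()
out = model(torch.ones(2))  # tensor([0.5000, 0.5000])
```

This only demonstrates the file format; how NeMo should wrap such a TorchScript tokenizer is the question for the maintainers.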

Besides, regarding NeMo: I installed it from source using the main branch. Is there any difference between this and the environment installed via Docker?

Looking forward to your reply.


zpx01 commented Jan 15, 2025

@lith0613 the branch of NeMo inside the container nvcr.io/nvidia/nemo:cosmos.1.0 is not the same as the main branch on the NeMo repository. We are currently in the process of merging Cosmos related changes to the main branch in the NeMo repository. If you use the container I mentioned above, I believe that should resolve your issues.
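For readers following along, a minimal sketch of pulling and entering that container (assuming Docker with the NVIDIA Container Toolkit is installed; the mount path is illustrative and flags may need adjusting for your setup):

```shell
# Pull the Cosmos-specific NeMo container and open a shell inside it,
# mounting the current directory so files survive the --rm cleanup.
IMAGE="nvcr.io/nvidia/nemo:cosmos.1.0"
docker pull "$IMAGE"
docker run --gpus all -it --rm \
    -v "$PWD":/workspace/host \
    "$IMAGE" bash
```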

@lith0613 (Author)

> @lith0613 the branch of NeMo inside the container nvcr.io/nvidia/nemo:cosmos.1.0 is not the same as the main branch on the NeMo repository. We are currently in the process of merging Cosmos related changes to the main branch in the NeMo repository. If you use the container I mentioned above, I believe that should resolve your issues.

Thanks! How can I download the NeMo repository inside the container nvcr.io/nvidia/nemo:cosmos.1.0? I just want to install the NeMo repository from source or via pip.


zpx01 commented Jan 16, 2025

@lith0613 NeMo comes pre-installed in the container. You can find the NeMo code in /opt/NeMo inside the container.
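If the goal is only to extract the source tree from the image without working inside a running container, one hedged approach (assuming Docker is installed; the container and directory names are arbitrary) is to copy `/opt/NeMo` out of a created-but-not-started container:

```shell
# Create (but don't start) a container from the image, copy the NeMo
# source out of it, then remove the container and install the copy.
docker create --name nemo-cosmos-src nvcr.io/nvidia/nemo:cosmos.1.0
docker cp nemo-cosmos-src:/opt/NeMo ./NeMo-cosmos
docker rm nemo-cosmos-src
pip install -e ./NeMo-cosmos
```

Note that this copies only the NeMo code; the container's pinned dependency versions are part of why the maintainers recommend working inside it, so a host install built this way may still diverge.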


lith0613 commented Jan 16, 2025

> @lith0613 NeMo comes pre-installed in the container. You can find the NeMo code in /opt/NeMo inside the container.

I am not familiar with Docker operations. Can you provide a NeMo installation package corresponding to Cosmos for me to install? I think many people need this package too. Thank you! @zpx01

@ethanhe42 (Member)

Hi @lith0613, using Docker is highly recommended; we haven't tested the code outside of the container.

@EthicalCell

I am also interested in running this outside of Docker. Unfortunately, there are no good cloud GPU providers (RunPod, Vast, etc.) that let you run your own Docker commands.

@sophiahhuang sophiahhuang added the enhancement New feature or request label Jan 27, 2025