Lumina-Image 2.0 : A Unified and Efficient Image Generative Model

📰 News

  • [2025-2-5] ComfyUI now supports Lumina-Image 2.0! 🎉 Thanks to the ComfyUI team! 🙌 Feel free to try it out! 🚀
  • [2025-1-31] We have released the latest .pth format weight file on Google Drive.
  • [2025-1-25] 🚀🚀🚀 We are excited to release Lumina-Image 2.0, including:
    • 🎯 Checkpoints, Fine-Tuning and Inference code.
    • 🎯 Website & Demo are live now! Check out the Huiying and Gradio Demo!

📑 Open-source Plan

  • Inference
  • Checkpoints
  • Web Demo (Gradio)
  • Finetuning code
  • ComfyUI
  • Diffusers
  • Technical Report
  • Unified multi-image generation

🎥 Demo

Demo.mp4

🎨 Qualitative Performance

Qualitative Results

📊 Quantitative Performance

Quantitative Results

🎮 Model Zoo

Resolution   Parameters   Text Encoder   VAE             Download URL
1024         2.6B         Gemma-2-2B     FLUX-VAE-16CH   Hugging Face

💻 Finetuning Code

1. Create a conda environment and install PyTorch

conda create -n Lumina2 -y
conda activate Lumina2
conda install python=3.11 pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y

2. Install dependencies

pip install -r requirements.txt

3. Install flash-attn

pip install flash-attn --no-build-isolation

4. Prepare data

Place the links to your data files in ./configs/data.yaml. Each image-text training pair should adhere to the following format (a small helper for producing such records is sketched after the example):

{
    "image_path": "path/to/your/image",
    "prompt": "a description of the image"
}
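
As a minimal sketch (not part of the repository), the helper below pairs images with same-named caption files and writes records in the documented image_path/prompt format; the folder layout, file names, and the assumption that the metadata file is a JSON list are all illustrative.

# build_metadata.py -- hypothetical helper, not shipped with the repo
import json
from pathlib import Path

def build_records(image_dir: str, caption_ext: str = ".txt") -> list[dict]:
    """Pair every image with a same-named caption file and emit one record per pair."""
    records = []
    for image_path in sorted(Path(image_dir).glob("*.jpg")):
        caption_file = image_path.with_suffix(caption_ext)
        if not caption_file.exists():
            continue  # skip images without a caption
        records.append({
            "image_path": str(image_path),               # field documented above
            "prompt": caption_file.read_text().strip(),  # field documented above
        })
    return records

if __name__ == "__main__":
    # Assumption: the data file referenced from ./configs/data.yaml is a JSON list of such records.
    with open("data/train_metadata.json", "w") as f:
        json.dump(build_records("data/images"), f, indent=2)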

5. Start finetuning

bash scripts/run_1024_finetune.sh

🚀 Inference Code

We support multiple ODE solvers for inference, including the Midpoint, Euler, and DPM solvers.
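
For intuition only, the snippet below sketches what the two simplest fixed-step solvers do over one sampling trajectory of a velocity-prediction (flow-matching) model. It is not the repository's implementation; the model callable, the time convention (t=0 noise, t=1 image), and the step count are stand-in assumptions.

# Conceptual sketch of Euler vs. midpoint sampling steps -- not the repo's code.
import torch

def euler_step(model, x, t, dt):
    """First-order Euler update: x <- x + dt * v(x, t)."""
    return x + dt * model(x, t)

def midpoint_step(model, x, t, dt):
    """Second-order midpoint update: evaluate the velocity at the half step."""
    x_half = x + 0.5 * dt * model(x, t)
    return x + dt * model(x_half, t + 0.5 * dt)

@torch.no_grad()
def sample(model, x0, num_steps=30, step_fn=midpoint_step):
    """Integrate from noise (t=0) to image (t=1) with a fixed step size."""
    x, dt = x0, 1.0 / num_steps
    for i in range(num_steps):
        x = step_fn(model, x, i * dt, dt)
    return x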

Note

Both the Gradio demo and the direct inference method use the .pth format weight file, which can be downloaded from Google Drive.

Note

You can also download the weights directly from Hugging Face. We have uploaded the .pth weight files there; simply point the --ckpt argument at the download directory.
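
If you prefer fetching the weights programmatically, a minimal sketch using huggingface_hub is shown below; the repo id is an assumption based on the GitHub repository name.

from huggingface_hub import snapshot_download

# Download the released weights into the local cache and get the directory path.
# Assumption: the Hub repo id matches the GitHub name.
ckpt_dir = snapshot_download(repo_id="Alpha-VLLM/Lumina-Image-2.0")
print(ckpt_dir)  # pass this path to demo.py via --ckpt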

  • Gradio Demo
python demo.py \
    --ckpt /path/to/your/ckpt \
    --res 1024 \
    --port 12123
  • Direct Batch Inference
bash scripts/sample.sh