- [2024-2-5] ComfyUI now supports Lumina-Image 2.0! 🎉 Thanks to ComfyUI@ComfyUI! 🙌 Feel free to try it out! 🚀
- [2024-1-31] We have released the latest .pth format weight file on Google Drive.
- [2024-1-25] 🚀🚀🚀 We are excited to release Lumina-Image 2.0, including:
  - 🎯 Checkpoints, Fine-Tuning and Inference code.
  - 🎯 Website & Demo are live now! Check out the Huiying and Gradio Demo!
- Inference
- Checkpoints
- Web Demo (Gradio)
- Finetuning code
- ComfyUI
- Diffusers
- Technical Report
- Unified multi-image generation
| Resolution | Parameters | Text Encoder | VAE | Download URL |
|---|---|---|---|---|
| 1024 | 2.6B | Gemma-2-2B | FLUX-VAE-16CH | Hugging Face |
conda create -n Lumina2 -y
conda activate Lumina2
conda install python=3.11 pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
You can place the links to your data files in ./configs/data.yaml. Your image-text pair training data should follow this format:
{
"image_path": "path/to/your/image",
"prompt": "a description of the image"
}
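As a rough illustration of the record format above, the following Python sketch builds a list of image-text records and writes them to a single JSON file. The helper names (`build_records`, `write_annotations`) are hypothetical, and storing all records as one JSON list is an assumption; check the data loader in the repository if it expects a different layout (for example, one record per line).

```python
import json
from pathlib import Path


def build_records(pairs):
    """Turn (image_path, prompt) pairs into records with the two documented keys."""
    records = []
    for image_path, prompt in pairs:
        prompt = prompt.strip()
        if not prompt:
            raise ValueError(f"empty prompt for image: {image_path}")
        records.append({"image_path": str(image_path), "prompt": prompt})
    return records


def write_annotations(records, out_file):
    """Write the records as one JSON list (assumed layout, not confirmed by the README)."""
    out_file = Path(out_file)
    out_file.parent.mkdir(parents=True, exist_ok=True)
    with out_file.open("w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return out_file
```

The resulting file path is what you would then list in ./configs/data.yaml.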
bash scripts/run_1024_finetune.sh
We support multiple solvers including Midpoint Solver, Euler Solver, and DPM Solver for inference.
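To give a feel for how two of these solvers differ, here is a generic Python sketch of Euler and midpoint integration. It assumes sampling amounts to integrating an ODE of the form dx/dt = v(x, t), where `v` stands in for the model's predicted update direction. This is a textbook illustration with hypothetical function names, not the repository's sampler, and the DPM solver is not shown.

```python
def euler_step(v, x, t, dt):
    """One Euler step: a single evaluation of v at the start of the interval."""
    return x + dt * v(x, t)


def midpoint_step(v, x, t, dt):
    """One midpoint step: evaluate v at the half-step estimate, then take the full step."""
    x_mid = x + 0.5 * dt * v(x, t)
    return x + dt * v(x_mid, t + 0.5 * dt)


def integrate(v, x0, num_steps, step_fn, t0=0.0, t1=1.0):
    """Integrate dx/dt = v(x, t) from t0 to t1 with a fixed step size."""
    dt = (t1 - t0) / num_steps
    x, t = x0, t0
    for _ in range(num_steps):
        x = step_fn(v, x, t, dt)
        t += dt
    return x
```

The midpoint rule costs two evaluations of `v` per step instead of one, but is second-order accurate, so it typically reaches a given accuracy in fewer steps than Euler.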
Note
Both the Gradio demo and the direct inference method use the .pth format weight file, which can be downloaded from Google Drive.
Note
You can also download the weights directly from Hugging Face. We have uploaded the .pth weight files there; after downloading, set the --ckpt argument to the directory that contains them.
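A minimal sketch of the Hugging Face route, assuming the `huggingface_hub` package is installed. The function name `download_weights` is hypothetical, and the repository id is deliberately left as a required argument because this section does not state it; take it from the download link in the table above.

```python
def download_weights(repo_id, ckpt_dir, patterns=("*.pth",)):
    """Download matching files from a Hugging Face repo into ckpt_dir and return the local path."""
    if not repo_id:
        raise ValueError("repo_id must name the Hugging Face repository that hosts the weights")
    # Imported lazily so the argument check above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=ckpt_dir,
        allow_patterns=list(patterns),  # restrict the download to the .pth weight files
    )
```

The returned directory can then be passed as the --ckpt argument to the commands below.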
- Gradio Demo
python demo.py \
--ckpt /path/to/your/ckpt \
--res 1024 \
--port 12123
- Direct Batch Inference
bash scripts/sample.sh