[Guide] Quantize your Diffusion Models with bnb
#10012
Conversation
Very nice start! 👏
I think you can combine this guide with the existing one here since there is quite a bit of overlap between the two. Here are some general tips for doing that:
- Keep the introduction in the existing guide but add a few sentences that adapt it to quantizing Flux.1-dev with bitsandbytes so you can run it on hardware with less than 16GB of memory. I think most users at this point have a general idea of what quantization is (and it is also covered in the getting started), so we don't need to spend more time on what it is/why it is important. The focus is more on bitsandbytes than quantization in general.
- I don't think it's necessary to have a section showing how to use an unquantized model. Users are probably more eager to see how they can use a quantized model, and getting them there as quickly as possible would be better.
- Combine the 8-bit quantization section with the existing one here. You can add details about how you're quantizing both the `T5EncoderModel` and the `FluxTransformer2DModel`, and what the `low_cpu_mem_usage` and `device_map` (if you have more than one GPU) parameters do (see the sketch after this list).
- You can do the same thing with the 4-bit section. Combine it with the existing one and add a few lines explaining the parameters.
- Combine the NF4 quantization section with the one here.
- Lead with the visualization in the method comparison section. Most users probably aren't too interested in comparing and running all this code themselves, so it's more impactful to lead with the results first.
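For concreteness, here is a minimal sketch of what the combined 8-bit section could show, assuming the `BitsAndBytesConfig` classes exposed by `transformers` and `diffusers`. The model id, prompt, and dtype choices are illustrative, not taken from the PR itself.

```python
# Sketch: 8-bit quantization of Flux.1-dev's two large submodels with bitsandbytes.
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
from transformers import T5EncoderModel
from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig

model_id = "black-forest-labs/FLUX.1-dev"

# Quantize the T5 text encoder (the larger of Flux's two text encoders) to 8-bit.
text_encoder_8bit = T5EncoderModel.from_pretrained(
    model_id,
    subfolder="text_encoder_2",
    quantization_config=TransformersBitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

# Quantize the transformer to 8-bit. `low_cpu_mem_usage=True` loads weights
# shard by shard instead of materializing the whole model in CPU RAM first;
# a `device_map` could additionally shard it across several GPUs.
transformer_8bit = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=DiffusersBitsAndBytesConfig(load_in_8bit=True),
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
)

pipe = FluxPipeline.from_pretrained(
    model_id,
    text_encoder_2=text_encoder_8bit,
    transformer=transformer_8bit,
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

# The 4-bit and NF4 sections would follow the same pattern, swapping in a
# 4-bit config such as this one.
nf4_config = DiffusersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```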
Suggested some nits. Greatly agree with @stevhliu's comments and recommendations.
```python
import torch

# Report peak GPU memory allocated on device 0, in GB.
memory_allocated = torch.cuda.max_memory_allocated(0) / (1024 ** 3)
print(f"GPU Memory Allocated: {memory_allocated:.2f} GB")
```
As a reader, I'd like to know how much it was at this point.
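For concreteness, a short sketch of how the guide could surface that number, assuming the quantized `pipe` object from the sketch above; resetting the peak counter first makes the reading attributable to inference alone. The prompt is illustrative.

```python
import torch

# Reset the peak-memory counter so the figure below covers only inference.
torch.cuda.reset_peak_memory_stats(0)

# `pipe` is the quantized pipeline from the earlier sketch (assumed).
image = pipe("a photo of a cat holding a sign", num_inference_steps=28).images[0]

memory_allocated = torch.cuda.max_memory_allocated(0) / (1024 ** 3)
print(f"GPU Memory Allocated: {memory_allocated:.2f} GB")
```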
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This PR adds a guide on quantization of diffusion models using `bnb` and `diffusers`. Here is a Colab notebook for easy code access.

CC: @stevhliu @sayakpaul