From 85f960c331f44b8c664290cbcb3253fc38966f77 Mon Sep 17 00:00:00 2001
From: valerio
Date: Tue, 21 Mar 2023 11:45:22 +0100
Subject: [PATCH] Update docs

---
 apps/accelerate/speedster/README.md             |  5 ++-
 .../diffusers_getting_started.md                | 45 +++++++++++++++----
 ...rate_Stable_Diffusion_with_Speedster.ipynb   |  3 ++
 notebooks/speedster/diffusers/Readme.md         |  2 +
 4 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/apps/accelerate/speedster/README.md b/apps/accelerate/speedster/README.md
index 5a3d8bb7..fb714a71 100644
--- a/apps/accelerate/speedster/README.md
+++ b/apps/accelerate/speedster/README.md
@@ -183,7 +183,10 @@ In this section, we will learn about the 4 main steps needed to optimize 🤗 Hu
 🧨 HuggingFace Diffusers
 
-
+
+> :warning: In order to work properly, the diffusers optimization requires `CUDA>=12.0` and `tensorrt>=8.6.0`. For additional details, please refer to the docs [here](https://docs.nebuly.com/Speedster/getting_started/diffusers_getting_started/).
+
+
 In this section, we will learn about the 4 main steps needed to optimize Stable Diffusion models from the Diffusers library:
 
 1) Input your model and data

diff --git a/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md b/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md
index d2994bb5..10216c7f 100644
--- a/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md
+++ b/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md
@@ -1,12 +1,41 @@
 # Getting started with Stable Diffusion optimization
-In this section, we will learn about the 4 main steps needed to optimize Stable Diffusion models from the `Diffusers` library:
+In this section, we will learn about the 5 main steps needed to optimize Stable Diffusion models from the `Diffusers` library:
-1. [Input your model and data](#1-input-model-and-data)
-2. [Run the optimization](#2-run-the-optimization)
-3. [Save your optimized model](#3-save-your-optimized-model)
-4. [Load and run your optimized model in production](#4-load-and-run-your-optimized-model-in-production)
+1. [Environment Setup](#1-environment-setup-gpu-only)
+2. [Input your model and data](#2-input-model-and-data)
+3. [Run the optimization](#3-run-the-optimization)
+4. [Save your optimized model](#4-save-your-optimized-model)
+5. [Load and run your optimized model in production](#5-load-and-run-your-optimized-model-in-production)
 
-## 1) Input model and data
+## 1) Environment Setup (GPU only)
+In order to optimize a Stable Diffusion model, you have to ensure that your environment is correctly set up according to these requirements:
+- `CUDA>=12.0`
+- `tensorrt>=8.6.0`
+
+Starting from TensorRT 8.6, all tensorrt pre-built wheels released by NVIDIA support only `CUDA>=12.0`.
+Speedster's auto-installer will install `tensorrt>=8.6.0` automatically only if it detects `CUDA>=12.0`; otherwise, it will install `tensorrt==8.5.3.1`. In that case, you will have to upgrade your CUDA version and then upgrade tensorrt to 8.6.0 or above to run the optimization.
+
+It should be possible to run TensorRT 8.6 with CUDA 11 as well, but it requires installing TensorRT in a different way; see this issue: https://github.com/NVIDIA/TensorRT/issues/2773. Otherwise, we highly suggest simply upgrading to CUDA 12.
+
+You can check your CUDA version with the following command:
+
+```bash
+nvidia-smi
+```
+
+If you have CUDA<12.0, you can upgrade it at this link: https://developer.nvidia.com/cuda-downloads
+
+You can check your TensorRT version with the following command:
+
+```bash
+python -c "import tensorrt; print(tensorrt.__version__)"
+```
+
+If you have an older version, after ensuring you have `CUDA>=12.0` installed, you can upgrade your TensorRT version by running:
+```bash
+pip install -U tensorrt
+```
+
+## 2) Input model and data
 !!! info
     In order to optimize a model with `Speedster`, first you should input the model you want to optimize and load some sample data that will be needed to test the optimization performances (latency, throughput, accuracy loss, etc).
@@ -42,7 +71,7 @@ input_data = [
-Now your input model and data are ready, you can move on to [Run the optimization](#2-run-the-optimization) section 🚀.
+Now your input model and data are ready, you can move on to the [Run the optimization](#3-run-the-optimization) section 🚀.
 
-## 2) Run the optimization
+## 3) Run the optimization
 Once the `model` and `input_data` have been defined, everything is ready to use Speedster's `optimize_model` function to optimize your model. The function takes the following arguments as inputs:
@@ -76,7 +105,7 @@ If the speedup you obtained is good enough for your application, you can move to
 If you want to squeeze out even more acceleration out of the model, please see the [`optimize_model` API](../advanced_options.md#optimize_model-api) section.
 Consider whether in your application you can trade off a little accuracy for much higher performance, and use the `metric`, `metric_drop_ths` and `optimization_time` arguments accordingly.
 
-## 3) Save your optimized model
+## 4) Save your optimized model
 After accelerating the model, it can be saved using the `save_model` function:
 
 ```python
@@ -87,7 +116,7 @@ save_model(optimized_model, "model_save_path")
 ```
 
-Now you are all set to use your optimized model in production. To explore how to do it, see the [Load and run your optimized model in production](#4-load-and-run-your-optimized-model-in-production) section.
+Now you are all set to use your optimized model in production. To explore how to do it, see the [Load and run your optimized model in production](#5-load-and-run-your-optimized-model-in-production) section.
 
-## 4) Load and run your optimized model in production
+## 5) Load and run your optimized model in production
 Once the optimized model has been saved, it can be loaded with the `load_model` function:
 ```python
 from speedster import load_model

diff --git a/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb b/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb
index 124b7f9b..07737648 100644
--- a/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb
+++ b/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb
@@ -219,6 +219,7 @@
     "- `CompVis/stable-diffusion-v1-4`\n",
     "- `runwayml/stable-diffusion-v1-5`\n",
     "- `stabilityai/stable-diffusion-2-1-base`\n",
+    "- `stabilityai/stable-diffusion-2-1` (only on GPUs with at least 22GB of memory; if you want to try a GPU with less memory, you have to uncomment `pipe.enable_attention_slicing()` in the cell below)\n",
     "\n",
     "Other Stable Diffusion versions from the Diffusers library should work but have never been tested. If you try a version not included among these and it works, please feel free to report it to us on [Discord](https://discord.com/invite/RbeQMu886J) so we can add it to the list of supported versions.
If you try a version that does not work, you can open an issue and possibly a PR on [GitHub](https://github.com/nebuly-ai/nebullvm/issues)."
   ]
@@ -255,6 +256,7 @@
     "if device == \"cuda\":\n",
     "    # On GPU we load by default the model in half precision, because it's faster and lighter.\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id, revision='fp16', torch_dtype=torch.float16)\n",
+    "    # pipe.enable_attention_slicing() # Uncomment for stable-diffusion-2.1 on GPUs with 16GB of memory like V100-16GB and T4\n",
     "else:\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id)\n"
   ]
@@ -458,6 +460,7 @@
   "source": [
     "if device == \"cuda\":\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id, revision='fp16', torch_dtype=torch.float16)\n",
+    "    # pipe.enable_attention_slicing() # Uncomment for stable-diffusion-2.1 on GPUs with 16GB of memory like V100-16GB and T4\n",
     "else:\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id)\n",
     "\n",

diff --git a/notebooks/speedster/diffusers/Readme.md b/notebooks/speedster/diffusers/Readme.md
index a8c6fe72..335a1b29 100644
--- a/notebooks/speedster/diffusers/Readme.md
+++ b/notebooks/speedster/diffusers/Readme.md
@@ -1,5 +1,7 @@
 # **Diffusers Optimization**
 
+> :warning: In order to work properly, the diffusers optimization requires `CUDA>=12.0` and `tensorrt>=8.6.0`. For additional details, please refer to the docs [here](https://docs.nebuly.com/Speedster/getting_started/diffusers_getting_started/).
+
 This section contains all the available notebooks that show how to leverage Speedster to optimize Diffusers models.
 
 ## Notebooks:
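
---

The CUDA/TensorRT version gate that the docs in this patch describe can be sketched as a small helper. This is only an illustration of the documented behavior ("tensorrt>=8.6.0 only when CUDA>=12.0 is detected, else tensorrt==8.5.3.1"); `parse_version` and `pick_tensorrt_pin` are hypothetical names, not part of Speedster's API:

```python
# Sketch of the version gate described in the docs above. The auto-installer
# is said to pick tensorrt>=8.6.0 only when CUDA>=12.0 is detected; otherwise
# it falls back to tensorrt==8.5.3.1. Helper names here are illustrative only.

def parse_version(v: str) -> tuple:
    """Turn a dotted version string like '12.0' into a comparable int tuple."""
    return tuple(int(p) for p in v.split(".") if p.isdigit())

def pick_tensorrt_pin(cuda_version: str) -> str:
    """Return the tensorrt requirement the docs describe for a CUDA version."""
    if parse_version(cuda_version) >= (12, 0):
        return "tensorrt>=8.6.0"
    return "tensorrt==8.5.3.1"

print(pick_tensorrt_pin("12.1"))  # tensorrt>=8.6.0
print(pick_tensorrt_pin("11.8"))  # tensorrt==8.5.3.1
```

Note the tuple comparison: `(11, 8) < (12, 0)`, so CUDA 11.8 correctly falls into the fallback branch even though `"11.8" > "12.0"` would be true as a naive string comparison.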