From 85f960c331f44b8c664290cbcb3253fc38966f77 Mon Sep 17 00:00:00 2001
From: valerio
Date: Tue, 21 Mar 2023 11:45:22 +0100
Subject: [PATCH] Update docs

---
 apps/accelerate/speedster/README.md             |  5 ++-
 .../diffusers_getting_started.md                | 45 +++++++++++++++----
 ...rate_Stable_Diffusion_with_Speedster.ipynb   |  3 ++
 notebooks/speedster/diffusers/Readme.md         |  2 +
 4 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/apps/accelerate/speedster/README.md b/apps/accelerate/speedster/README.md
index 5a3d8bb7..fb714a71 100644
--- a/apps/accelerate/speedster/README.md
+++ b/apps/accelerate/speedster/README.md
@@ -183,7 +183,10 @@ In this section, we will learn about the 4 main steps needed to optimize 🤗 Hu
 🧨 HuggingFace Diffusers
 
-
+
+> :warning: In order to work properly, the diffusers optimization requires `CUDA>=12.0` and `tensorrt>=8.6.0`. For additional details, please refer to the docs [here](https://docs.nebuly.com/Speedster/getting_started/diffusers_getting_started/).
+
+
 In this section, we will learn about the 4 main steps needed to optimize Stable Diffusion models from the Diffusers library:
 
 1) Input your model and data

diff --git a/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md b/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md
index d2994bb5..10216c7f 100644
--- a/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md
+++ b/apps/accelerate/speedster/docs/en/docs/getting_started/diffusers_getting_started.md
@@ -1,12 +1,41 @@
 # Getting started with Stable Diffusion optimization
-In this section, we will learn about the 4 main steps needed to optimize Stable Diffusion models from the `Diffusers` library:
+In this section, we will learn about the 5 main steps needed to optimize Stable Diffusion models from the `Diffusers` library:
-1. [Input your model and data](#1-input-model-and-data)
-2. [Run the optimization](#2-run-the-optimization)
-3. [Save your optimized model](#3-save-your-optimized-model)
-4. [Load and run your optimized model in production](#4-load-and-run-your-optimized-model-in-production)
+1. [Environment Setup](#1-environment-setup-gpu-only)
+2. [Input your model and data](#2-input-model-and-data)
+3. [Run the optimization](#3-run-the-optimization)
+4. [Save your optimized model](#4-save-your-optimized-model)
+5. [Load and run your optimized model in production](#5-load-and-run-your-optimized-model-in-production)
 
-## 1) Input model and data
+## 1) Environment Setup (GPU only)
+In order to optimize a Stable Diffusion model, you have to ensure that your environment is correctly set up according to these requirements:
+- `CUDA>=12.0`
+- `tensorrt>=8.6.0`
+
+Starting from TensorRT 8.6, all tensorrt pre-built wheels released by NVIDIA support only `CUDA>=12.0`.
+Speedster's auto-installer will install `tensorrt>=8.6.0` automatically only if it detects `CUDA>=12.0`; otherwise, it will install `tensorrt==8.5.3.1`. In that case, you will have to upgrade your CUDA version and then upgrade tensorrt to 8.6.0 or above to run the optimization.
+
+It should be possible to run TensorRT 8.6 with CUDA 11 as well, but it requires installing TensorRT in a different way; see this issue: https://github.com/NVIDIA/TensorRT/issues/2773. Otherwise, we highly suggest simply upgrading to CUDA 12.
+
+You can check your CUDA version with the following command:
+
+```bash
+nvidia-smi
+```
+
+If you have CUDA<12.0, you can upgrade it at this link: https://developer.nvidia.com/cuda-downloads
+
+You can check your TensorRT version with the following command:
+
+```bash
+python -c "import tensorrt; print(tensorrt.__version__)"
+```
+
+If you have an older version, after ensuring you have `CUDA>=12.0` installed, you can upgrade your TensorRT version by running:
+```bash
+pip install -U tensorrt
+```
+
+## 2) Input model and data
 !!! info
     In order to optimize a model with `Speedster`, first you should input the model you want to optimize and load some sample data that will be needed to test the optimization performances (latency, throughput, accuracy loss, etc).
@@ -42,7 +71,7 @@ input_data = [
-Now your input model and data are ready, you can move on to [Run the optimization](#2-run-the-optimization) section 🚀.
+Now your input model and data are ready, you can move on to the [Run the optimization](#3-run-the-optimization) section 🚀.
 
-## 2) Run the optimization
+## 3) Run the optimization
 Once the `model` and `input_data` have been defined, everything is ready to use Speedster's `optimize_model` function to optimize your model. The function takes the following arguments as inputs:
@@ -76,7 +105,7 @@ If the speedup you obtained is good enough for your application, you can move to
 If you want to squeeze out even more acceleration out of the model, please see the [`optimize_model` API](../advanced_options.md#optimize_model-api) section.
 Consider whether in your application you can trade off a little accuracy for much higher performance, and use the `metric`, `metric_drop_ths` and `optimization_time` arguments accordingly.
 
-## 3) Save your optimized model
+## 4) Save your optimized model
 After accelerating the model, it can be saved using the `save_model` function:
 
 ```python
@@ -87,7 +116,7 @@ save_model(optimized_model, "model_save_path")
 ```
 
-Now you are all set to use your optimized model in production. To explore how to do it, see the [Load and run your optimized model in production](#4-load-and-run-your-optimized-model-in-production) section.
+Now you are all set to use your optimized model in production. To explore how to do it, see the [Load and run your optimized model in production](#5-load-and-run-your-optimized-model-in-production) section.
 
-## 4) Load and run your optimized model in production
+## 5) Load and run your optimized model in production
 Once the optimized model has been saved, it can be loaded with the `load_model` function:
 ```python
 from speedster import load_model

diff --git a/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb b/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb
index 124b7f9b..07737648 100644
--- a/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb
+++ b/notebooks/speedster/diffusers/Accelerate_Stable_Diffusion_with_Speedster.ipynb
@@ -219,6 +219,7 @@
     "- `CompVis/stable-diffusion-v1-4`\n",
     "- `runwayml/stable-diffusion-v1-5`\n",
     "- `stabilityai/stable-diffusion-2-1-base`\n",
+    "- `stabilityai/stable-diffusion-2-1` (only on GPUs with at least 22GB of memory; if you want to try a GPU with less memory, you have to uncomment `pipe.enable_attention_slicing()` in the cell below)\n",
     "\n",
     "Other Stable Diffusion versions from the Diffusers library should work but have never been tested. If you try a version not included among these and it works, please feel free to report it to us on [Discord](https://discord.com/invite/RbeQMu886J) so we can add it to the list of supported versions.
If you try a version that does not work, you can open an issue and possibly a PR on [GitHub](https://github.com/nebuly-ai/nebullvm/issues)."
   ]
@@ -255,6 +256,7 @@
     "if device == \"cuda\":\n",
     "    # On GPU we load by default the model in half precision, because it's faster and lighter.\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id, revision='fp16', torch_dtype=torch.float16)\n",
+    "    # pipe.enable_attention_slicing() # Uncomment for stable-diffusion-2.1 on GPUs with 16GB of memory like V100-16GB and T4\n",
     "else:\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id)\n"
   ]
@@ -458,6 +460,7 @@
   "source": [
     "if device == \"cuda\":\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id, revision='fp16', torch_dtype=torch.float16)\n",
+    "    # pipe.enable_attention_slicing() # Uncomment for stable-diffusion-2.1 on GPUs with 16GB of memory like V100-16GB and T4\n",
     "else:\n",
     "    pipe = StableDiffusionPipeline.from_pretrained(model_id)\n",
     "\n",

diff --git a/notebooks/speedster/diffusers/Readme.md b/notebooks/speedster/diffusers/Readme.md
index a8c6fe72..335a1b29 100644
--- a/notebooks/speedster/diffusers/Readme.md
+++ b/notebooks/speedster/diffusers/Readme.md
@@ -1,5 +1,7 @@
 # **Diffusers Optimization**
 
+> :warning: In order to work properly, the diffusers optimization requires `CUDA>=12.0` and `tensorrt>=8.6.0`. For additional details, please refer to the docs [here](https://docs.nebuly.com/Speedster/getting_started/diffusers_getting_started/).
+
 This section contains all the available notebooks that show how to leverage Speedster to optimize Diffusers models.
 
 ## Notebooks:
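
---

The CUDA/TensorRT version gate that the docs in this patch describe can be sketched as a small helper. This is only an illustration of the documented behavior ("tensorrt>=8.6.0 only when CUDA>=12.0 is detected, else tensorrt==8.5.3.1"); `parse_version` and `pick_tensorrt_pin` are hypothetical names, not part of Speedster's API:

```python
# Sketch of the version gate described in the docs above. The auto-installer
# is said to pick tensorrt>=8.6.0 only when CUDA>=12.0 is detected; otherwise
# it falls back to tensorrt==8.5.3.1. Helper names here are illustrative only.

def parse_version(v: str) -> tuple:
    """Turn a dotted version string like '12.0' into a comparable int tuple."""
    return tuple(int(p) for p in v.split(".") if p.isdigit())

def pick_tensorrt_pin(cuda_version: str) -> str:
    """Return the tensorrt requirement the docs describe for a CUDA version."""
    if parse_version(cuda_version) >= (12, 0):
        return "tensorrt>=8.6.0"
    return "tensorrt==8.5.3.1"

print(pick_tensorrt_pin("12.1"))  # tensorrt>=8.6.0
print(pick_tensorrt_pin("11.8"))  # tensorrt==8.5.3.1
```

Note the tuple comparison: `(11, 8) < (12, 0)`, so CUDA 11.8 correctly falls into the fallback branch even though `"11.8" > "12.0"` would be true as a naive string comparison.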