Merge pull request #279 from valeriosofi/stable_diffusion
Add requirements in the docs for stable diffusion
Nebuly authored Mar 21, 2023
2 parents bd38730 + c5ff3a4 commit a7a18af
Showing 4 changed files with 45 additions and 9 deletions.
4 changes: 3 additions & 1 deletion apps/accelerate/speedster/README.md
@@ -183,7 +183,9 @@ In this section, we will learn about the 4 main steps needed to optimize 🤗 Hu

<details>
<summary>🧨 Hugging Face Diffusers </summary>


> :warning: In order to work properly, the diffusers optimization requires `CUDA>=12.0` and `tensorrt>=8.6.0`. For additional details, please see the docs [here](https://docs.nebuly.com/Speedster/getting_started/diffusers_getting_started/).

In this section, we will learn about the 4 main steps needed to optimize Stable Diffusion models from the Diffusers library:

1) Input your model and data
@@ -1,12 +1,41 @@
# Getting started with Stable Diffusion optimization
In this section, we will learn about the 4 main steps needed to optimize Stable Diffusion models from the `Diffusers` library:

1. [Input your model and data](#1-input-model-and-data)
2. [Run the optimization](#2-run-the-optimization)
3. [Save your optimized model](#3-save-your-optimized-model)
4. [Load and run your optimized model in production](#4-load-and-run-your-optimized-model-in-production)
1. [Environment Setup](#1-environment-setup-gpu-only)
2. [Input your model and data](#2-input-model-and-data)
3. [Run the optimization](#3-run-the-optimization)
4. [Save your optimized model](#4-save-your-optimized-model)
5. [Load and run your optimized model in production](#5-load-and-run-your-optimized-model-in-production)

## 1) Input model and data
## 1) Environment Setup (GPU only)
To optimize a Stable Diffusion model, first make sure that your environment meets these requirements:
- `CUDA>=12.0`
- `tensorrt>=8.6.0`

Starting from TensorRT 8.6, all pre-built tensorrt wheels released by NVIDIA support only `CUDA>=12.0`. Speedster's auto-installer installs `tensorrt>=8.6.0` automatically only if it detects `CUDA>=12.0`; otherwise it installs `tensorrt==8.5.3.1`. In that case, you will have to upgrade your CUDA version and then upgrade tensorrt to 8.6.0 or above to run the optimization.

It should also be possible to run TensorRT 8.6 with CUDA 11, but this requires installing TensorRT differently; see this issue: https://github.com/NVIDIA/TensorRT/issues/2773. Otherwise, we strongly suggest simply upgrading to CUDA 12.

You can check your CUDA version with the following command:

```bash
nvidia-smi
```
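If you prefer to check programmatically, the CUDA version can be parsed out of the `nvidia-smi` header line. This is a minimal sketch, not part of Speedster; the header layout may vary slightly across driver versions, so treat the parsing as an illustration:

```python
import re
from typing import Optional

def cuda_version_from_smi(header: str) -> Optional[str]:
    """Extract the CUDA version from nvidia-smi's header text, if present."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", header)
    return match.group(1) if match else None

# Sample header line as printed by nvidia-smi (values are illustrative):
sample = "| NVIDIA-SMI 535.54.03    Driver Version: 535.54.03    CUDA Version: 12.2     |"
print(cuda_version_from_smi(sample))  # → 12.2
```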

If you have CUDA<12.0, you can upgrade it at this link: https://developer.nvidia.com/cuda-downloads

You can check your TensorRT version with the following command:

```bash
python -c "import tensorrt; print(tensorrt.__version__)"
```

If you have an older version, after ensuring you have `CUDA>=12.0` installed, you can upgrade your TensorRT version by running:
```bash
pip install -U tensorrt
```
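Putting the two requirements together, a small helper can compare dotted version strings against the minimums above. This is an illustrative sketch, not part of Speedster's API; in a real environment the installed versions would come from `nvidia-smi` and `tensorrt.__version__` as shown above:

```python
def version_tuple(version: str) -> tuple:
    """Turn a dotted version string like '8.6.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())

def meets_min(installed: str, minimum: str) -> bool:
    """Return True if `installed` satisfies the `minimum` requirement."""
    return version_tuple(installed) >= version_tuple(minimum)

# Checks against the requirements for the diffusers optimization:
print(meets_min("12.2", "12.0"))      # CUDA new enough → True
print(meets_min("8.5.3.1", "8.6.0"))  # TensorRT too old → False
```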

## 2) Input model and data

!!! info
    In order to optimize a model with `Speedster`, you should first input the model you want to optimize and load some sample data that will be needed to test the optimization performance (latency, throughput, accuracy loss, etc.).
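For Stable Diffusion models, the sample data is typically a small list of text prompts. The sketch below only illustrates the assumed format (the prompts themselves are made up); the full pipeline-loading code is shown in the example that follows:

```python
# Illustrative only: for the diffusers optimization, the sample data is
# assumed to be a small list of text prompts used to benchmark the pipeline.
input_data = [
    "a photo of an astronaut riding a horse on mars",
    "a monkey eating a banana in a forest",
    "white car on a road surrounded by palm trees",
]
print(len(input_data))  # → 3
```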
@@ -42,7 +71,7 @@ input_data = [

Now that your input model and data are ready, you can move on to the [Run the optimization](#3-run-the-optimization) section 🚀.

## 2) Run the optimization
## 3) Run the optimization
Once the `model` and `input_data` have been defined, everything is ready to use Speedster's `optimize_model` function to optimize your model.

The function takes the following arguments as inputs:
@@ -76,7 +105,7 @@ If the speedup you obtained is good enough for your application, you can move to

If you want to squeeze even more acceleration out of the model, please see the [`optimize_model` API](../advanced_options.md#optimize_model-api) section. Consider whether your application can trade a little accuracy for much higher performance, and use the `metric`, `metric_drop_ths` and `optimization_time` arguments accordingly.

## 3) Save your optimized model
## 4) Save your optimized model
After accelerating the model, it can be saved using the `save_model` function:

```python
@@ -87,7 +116,7 @@ save_model(optimized_model, "model_save_path")

Now you are all set to use your optimized model in production. To explore how to do it, see the [Load and run your optimized model in production](#5-load-and-run-your-optimized-model-in-production) section.

## 4) Load and run your optimized model in production
## 5) Load and run your optimized model in production
Once the optimized model has been saved, it can be loaded with the `load_model` function:
```python
from speedster import load_model
@@ -219,6 +219,7 @@
"- `CompVis/stable-diffusion-v1-4`\n",
"- `runwayml/stable-diffusion-v1-5`\n",
"- `stabilityai/stable-diffusion-2-1-base`\n",
"- `stabilityai/stable-diffusion-2-1` (only on GPUs with at least 22GB of memory; to try it on a GPU with less memory, you have to uncomment `pipe.enable_attention_slicing()` in the cell below)\n",
"\n",
"Other Stable Diffusion versions from the Diffusers library should work but have never been tested. If you try a version not included among these and it works, please feel free to report it to us on [Discord](https://discord.com/invite/RbeQMu886J) so we can add it to the list of supported versions. If you try a version that does not work, you can open an issue and possibly a PR on [GitHub](https://github.com/nebuly-ai/nebullvm/issues)."
]
@@ -255,6 +256,7 @@
"if device == \"cuda\":\n",
" # On GPU we load by default the model in half precision, because it's faster and lighter.\n",
" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision='fp16', torch_dtype=torch.float16)\n",
" # pipe.enable_attention_slicing() # Uncomment for stable-diffusion-2.1 on GPUs with 16GB of memory like V100-16GB and T4\n",
"else:\n",
" pipe = StableDiffusionPipeline.from_pretrained(model_id)\n"
]
@@ -446,6 +448,7 @@
"source": [
"if device == \"cuda\":\n",
" pipe = StableDiffusionPipeline.from_pretrained(model_id, revision='fp16', torch_dtype=torch.float16)\n",
" # pipe.enable_attention_slicing() # Uncomment for stable-diffusion-2.1 on GPUs with 16GB of memory like V100-16GB and T4\n",
"else:\n",
" pipe = StableDiffusionPipeline.from_pretrained(model_id)\n",
"\n",
2 changes: 2 additions & 0 deletions notebooks/speedster/diffusers/Readme.md
@@ -1,5 +1,7 @@
# **Diffusers Optimization**

> :warning: In order to work properly, the diffusers optimization requires `CUDA>=12.0` and `tensorrt>=8.6.0`. For additional details, please see the docs [here](https://docs.nebuly.com/Speedster/getting_started/diffusers_getting_started/).

This section contains all the available notebooks that show how to leverage Speedster to optimize Diffusers models.

## Notebooks:
