Merge pull request #1 from Roblox/update-readme
Update README with Arxiv Link
kelpabc123 authored Nov 19, 2024
2 parents f1a6c9f + d960c83 commit c31eb61
Showing 1 changed file with 17 additions and 2 deletions.
19 changes: 17 additions & 2 deletions README.md
@@ -15,6 +15,8 @@
# Introduction
We introduce **SmoothCache**, a straightforward acceleration technique for DiT architecture models that is **training-free, flexible, and performant**. By leveraging layer-wise representation error, our method identifies redundancies in the diffusion process and generates a static caching scheme that reuses output feature maps, reducing the need for computationally expensive operations. This solution works across different models and modalities, can be easily dropped into existing Diffusion Transformer pipelines, can be stacked on different solvers, and requires no additional training or datasets. **SmoothCache** consistently outperforms various solvers designed to accelerate the diffusion process, while matching or surpassing the performance of existing modality-specific caching techniques.

> 🥯[[Arxiv]](https://arxiv.org/abs/2411.10510)
![Illustration of SmoothCache. When the layer representation loss obtained from the calibration pass is below some threshold α, the corresponding layer is cached and used in place of the same computation on a future timestep. The figure on the left shows how the layer representation error impacts whether certain layers are eligible for caching. The error of the attention (attn) layer is higher in earlier timesteps, so our schedule caches the later timesteps accordingly. The figure on the right shows the application of the caching schedule to the DiT-XL architecture. The output of the attn layer at time t − 1 is cached and re-used in place of computing FFN t − 2, since the corresponding error is below α. This cached output is introduced in the model using the properties of the residual connection.](assets/SmoothCache2.png)
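
As a rough illustration of the thresholding idea described in the figure caption, the sketch below turns per-timestep calibration errors into a static compute-or-reuse schedule. It is not the SmoothCache package's implementation: the function name `build_schedule`, the layout of the error dictionary, and the example error values are all hypothetical, and the actual error metric and schedule format may differ.

```python
from typing import Dict, List


def build_schedule(errors: Dict[str, List[float]], alpha: float) -> Dict[str, List[int]]:
    """Hypothetical helper: derive a static caching schedule from calibration errors.

    errors[layer][i] is assumed to be the representation error between the
    layer's outputs at sampling step i and step i + 1, measured once during a
    calibration pass. In the returned lists, 1 means "compute the layer" and
    0 means "reuse the cached output".
    """
    schedule = {}
    for layer, layer_errors in errors.items():
        steps = [1]  # the first step is always computed; nothing is cached yet
        for err in layer_errors:
            steps.append(0 if err < alpha else 1)
        schedule[layer] = steps
    return schedule


# Made-up error values for a 4-step sampler, for illustration only.
example_errors = {
    "attn": [0.50, 0.10, 0.05],
    "ffn": [0.02, 0.30, 0.01],
}
example_schedule = build_schedule(example_errors, alpha=0.15)
print(example_schedule)
```

Under these assumptions the schedule depends only on the calibration errors and α, so it can be computed once and reused across sampling runs; raising α marks more steps for reuse.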

## Quick Start
@@ -26,7 +28,7 @@ pip install SmoothCache

### Usage

Inspired by [DeepCache](https://github.com/horseee/DeepCache), we have implemented drop-in SmoothCache helper classes that easily apply to the [Huggingface Diffuser DiTPipeline](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/dit) and the [original DiT implementations](https://github.com/facebookresearch/DiT).

Generally, only 3 additional lines need to be added to the original sampler scripts:
```python
@@ -156,4 +158,17 @@ Note that L2C is not training free](assets/table1.png)


# License
SmoothCache is licensed under the [Apache-2.0](LICENSE) license.

## Bibtex
```
@misc{liu2024smoothcacheuniversalinferenceacceleration,
title={SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers},
author={Joseph Liu and Joshua Geddes and Ziyu Guo and Haomiao Jiang and Mahesh Kumar Nandwana},
year={2024},
eprint={2411.10510},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2411.10510},
}
```
