Add missing examples in docs, fix typos
cabralpinto committed Aug 27, 2023
1 parent 62c8461 commit 4da5ae1
Showing 14 changed files with 35 additions and 15 deletions.
Binary file modified docs/public/images/modules/noise-schedule/constant.png
Binary file modified docs/public/images/modules/noise-schedule/cosine.png
Binary file modified docs/public/images/modules/noise-schedule/linear.png
Binary file modified docs/public/images/modules/noise-schedule/sqrt.png
Binary file added docs/public/images/modules/noise-type/absorbing.png
Binary file added docs/public/images/modules/noise-type/gaussian.png
Binary file added docs/public/images/modules/noise-type/uniform.png
8 changes: 4 additions & 4 deletions docs/src/pages/guides/custom-modules.mdx
@@ -14,11 +14,11 @@ When tinkering with Diffusion Models, the time will come when you need to venture...
>
> As with all library code, this tutorial adheres to strict type checking standards. Although we recommend typing your code, you may elect to skip writing type annotations. If you do, however, you will not receive a warning when you try to mix incompatible modules, nor benefit from other useful IntelliSense.
## Data transformation
## Data transform

In many Diffusion Model applications, the diffusion process takes place in the dataset space. If this is your case, the prebuilt `Identity` data transformation module will serve your purposes, leaving your data untouched before applying noise during training. However, a growing number of algorithms, like [Stable Diffusion](https://arxiv.org/abs/2112.10752) and [Diffusion-LM](https://arxiv.org/abs/2205.14217), project data onto a latent space before applying diffusion.
In many Diffusion Model applications, the diffusion process takes place in the dataset space. If this is your case, the prebuilt `Identity` data transform module will serve your purposes, leaving your data untouched before applying noise during training. However, a growing number of algorithms, like [Stable Diffusion](https://arxiv.org/abs/2112.10752) and [Diffusion-LM](https://arxiv.org/abs/2205.14217), project data onto a latent space before applying diffusion.

In the case of Diffusion-LM, the dataset consists of sequences of word IDs, but the diffusion process happens in the word embedding space. This means you need a way of converting sequences of word IDs into sequences of embeddings and training the embeddings along with the Diffusion Model. In Modular Diffusion, this can be achieved by extending the `Data` base class and implementing its `encode` and `decode` methods. The former projects the data into the latent space and the latter maps it back to the dataset space. Let's take a look at how you could implement the aforementioned transformation:
In the case of Diffusion-LM, the dataset consists of sequences of word IDs, but the diffusion process happens in the word embedding space. This means you need a way of converting sequences of word IDs into sequences of embeddings and training the embeddings along with the Diffusion Model. In Modular Diffusion, this can be achieved by extending the `Data` base class and implementing its `encode` and `decode` methods. The former projects the data into the latent space and the latter maps it back to the dataset space. Let's take a look at how you could implement the aforementioned transform:

```python
from diffusion.base import Data
```

@@ -40,7 +40,7 @@ class Embedding(Data):

In the `encode` method, we transform the input tensor `w` into an embedding tensor using the learned embedding layer. The `decode` method reverses this operation by finding, for each vector in `x`, the most similar embedding in the embedding weight matrix.
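The body of the `Embedding` class is collapsed in the diff above. As a point of reference, a minimal sketch of such a transform could look like the following; the constructor signature and the `cdist`-based nearest-neighbor decoding are illustrative assumptions, not the library's exact code:

```python
import torch
from torch import Tensor, nn

from diffusion.base import Data


class Embedding(Data):
    def __init__(self, w: Tensor, count: int, dimension: int, **kwargs) -> None:
        super().__init__(w, **kwargs)  # assumed base-class signature
        # Trainable embedding table, learned jointly with the Diffusion Model.
        self.embedding = nn.Embedding(count, dimension)

    def encode(self, w: Tensor) -> Tensor:
        # Word IDs -> word embeddings (dataset space -> latent space).
        return self.embedding(w)

    def decode(self, x: Tensor) -> Tensor:
        # Each vector in x -> ID of its nearest embedding (latent -> dataset space).
        return torch.cdist(x, self.embedding.weight).argmin(-1)
```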

Data transformations can also be useful in cases where they have no trainable parameters. For example, the `Categorical` noise module operates over one-hot vectors, which are very memory-inefficient. To mitigate this, you may store your data as a list of labels and use the `OneHot` data transformation module to transform it into one-hot vectors on a batch-by-batch basis, saving you a lot of memory. Or your data transformation can just be a frozen variational autoencoder, like in [Stable Diffusion](https://arxiv.org/abs/2112.10752). For further details, check out our [Text Generation](/modular-diffusion/guides/text-generation) and [Image Generation](/modular-diffusion/guides/image-generation) tutorials.
Data transforms can also be useful in cases where they have no trainable parameters. For example, the `Categorical` noise module operates over one-hot vectors, which are very memory-inefficient. To mitigate this, you may store your data as a list of labels and use the `OneHot` data transform module to transform it into one-hot vectors on a batch-by-batch basis, saving you a lot of memory. Or your data transform can just be a frozen variational autoencoder, like in [Stable Diffusion](https://arxiv.org/abs/2112.10752). For further details, check out our [Text Generation](/modular-diffusion/guides/text-generation) and [Image Generation](/modular-diffusion/guides/image-generation) tutorials.
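As a rough illustration of this workflow (the `diffusion.data` import path and the `OneHot` constructor arguments are assumptions based on the surrounding guides):

```python
import torch

from diffusion.data import OneHot

# Keep the dataset as compact integer labels...
w = torch.randint(0, 26, (1000, 64))  # 1000 sequences of 64 letter IDs
# ...and expand each batch to one-hot vectors only when it is needed.
data = OneHot(w, k=26, batch=32, shuffle=True)
```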

## Noise schedule

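The body of this section is collapsed in the diff. For orientation, a custom schedule ultimately just produces the sequence $\alpha_1, \dots, \alpha_T$; here is a sketch under the assumption that schedules extend a `Schedule` base class in `diffusion.base` and implement a `compute` hook:

```python
import torch
from torch import Tensor

from diffusion.base import Schedule


class Constant(Schedule):
    def __init__(self, steps: int, value: float) -> None:
        super().__init__(steps)  # assumed: base class stores `steps`
        self.value = value

    def compute(self) -> Tensor:
        # alpha_t = k for every time step t = 1..T.
        return torch.full((self.steps,), self.value)
```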
4 changes: 2 additions & 2 deletions docs/src/pages/guides/getting-started.mdx
@@ -35,7 +35,7 @@

```python
x, _ = zip(*MNIST("data", download=True, transform=ToTensor()))
x = torch.stack(x) * 2 - 1
```

Let's build our Diffusion Model next. Modular Diffusion provides you with the `diffusion.Model` class, which takes as parameters a **data transformation**, a **noise schedule**, a **noise type**, a **denoiser neural network**, and a **loss function**, along with other optional parameters. You can import prebuilt components for these parameters from the different modules inside Modular Diffusion or build your own. Let's take a look at a simple example which replicates the architecture introduced in [Ho et al. (2020)](https://arxiv.org/abs/2006.11239), using only prebuilt components:
Let's build our Diffusion Model next. Modular Diffusion provides you with the `diffusion.Model` class, which takes as parameters a **data transform**, a **noise schedule**, a **noise type**, a **denoiser neural network**, and a **loss function**, along with other optional parameters. You can import prebuilt components for these parameters from the different modules inside Modular Diffusion or build your own. Let's take a look at a simple example which replicates the architecture introduced in [Ho et al. (2020)](https://arxiv.org/abs/2006.11239), using only prebuilt components:

```python
import diffusion
```

@@ -111,7 +111,7 @@

```python
x, y = zip(*MNIST(str(input), transform=ToTensor(), download=True))
x, y = torch.stack(x) * 2 - 1, torch.tensor(y) + 1
```

Once again, let's assemble our Diffusion Model. This time, we will add the labels `y` to our data transformation object and provide the number of labels to our denoiser network. Let's also add classifier-free guidance to the model, a technique introduced in [Ho et al. (2022)](https://arxiv.org/abs/2207.12598) that improves sample quality in conditional generation at the cost of extra sampling time and reduced sample variety.
Once again, let's assemble our Diffusion Model. This time, we will add the labels `y` to our data transform object and provide the number of labels to our denoiser network. Let's also add classifier-free guidance to the model, a technique introduced in [Ho et al. (2022)](https://arxiv.org/abs/2207.12598) that improves sample quality in conditional generation at the cost of extra sampling time and reduced sample variety.

```python
from diffusion.guidance import ClassifierFree
```
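The remainder of the example is collapsed above. Here is a sketch of how the pieces could fit together, building on the `x` and `y` tensors from the previous snippet; the import paths and argument names not shown in the diff (`Identity`, `Cosine`, `UNet`, `labels`, `dropout`, `strength`) are assumptions for illustration:

```python
import diffusion
from diffusion.data import Identity
from diffusion.guidance import ClassifierFree
from diffusion.loss import Simple
from diffusion.net import UNet
from diffusion.noise import Gaussian
from diffusion.schedule import Cosine

model = diffusion.Model(
    data=Identity(x, y, batch=128, shuffle=True),  # labels ride along with the data
    schedule=Cosine(1000),
    noise=Gaussian(parameter="epsilon", variance="fixed"),
    net=UNet(channels=(1, 64, 128, 256), labels=10),  # assumed label count parameter
    guidance=ClassifierFree(dropout=0.1, strength=2),  # label dropout + guidance strength
    loss=Simple(parameter="epsilon"),
)
```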
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/src/pages/modules/loss-function.mdx
@@ -19,7 +19,7 @@ While not a loss module, the `Batch` object is a fundamental component of Modular Diffusion...
### Properties

- `w` -> Initial data tensor $w$.
- `x` -> Data tensor after transformation $x_0$.
- `x` -> Data tensor after transform $x_0$.
- `y` -> Label tensor $y$.
- `t` -> Time step tensor $t$.
- `epsilon` -> Noise tensor $\epsilon$. May be `None` for certain noise types.
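To make the role of `Batch` concrete, here is a hedged sketch of a loss module reading these fields; the `hat` property holding the network predictions and the `compute` hook are assumptions for illustration:

```python
import torch

from diffusion.base import Batch, Loss


class EpsilonMSE(Loss):
    def compute(self, batch: Batch) -> torch.Tensor:
        # Mean squared error between the true noise and the network's prediction.
        return torch.nn.functional.mse_loss(batch.hat[0], batch.epsilon)
```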
4 changes: 2 additions & 2 deletions docs/src/pages/modules/noise-schedule.mdx
@@ -22,12 +22,12 @@ Constant noise schedule given by $\alpha_t = k$.
```python
from diffusion.schedule import Constant

schedule = Constant(1000, 0.01)
schedule = Constant(1000, 0.995)
```

### Visualization

Applying `Gaussian` noise to an image using the `Constant` schedule with $T=1000$ and $k=0.01$ in equally spaced snapshots:
Applying `Gaussian` noise to an image using the `Constant` schedule with $T=1000$ and $k=0.995$ in equally spaced snapshots:
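With $k = 0.995$, the cumulative signal weight after all $T = 1000$ steps is $\bar{\alpha}_T = 0.995^{1000} \approx 6.7 \times 10^{-3}$, meaning the image is almost, but not quite, fully destroyed by the last step. The previous value of $k = 0.01$ would have wiped out virtually all signal in the very first step.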

![Image of a dog getting noisier at a constant rate.](/modular-diffusion/images/modules/noise-schedule/constant.png)

26 changes: 23 additions & 3 deletions docs/src/pages/modules/noise-type.mdx
@@ -41,9 +41,9 @@ noise = Gaussian(parameter="epsilon", variance="fixed")

### Visualization

Applying `Gaussian` noise to an image using the `Linear` schedule with $T=1000$, $\alpha_0=0.9999$ and $\alpha_T=0.98$ in equally spaced snapshots:
Applying `Gaussian` noise to an image using the `Cosine` schedule with $T=1000$, $s = 8 \times 10^{-3}$ and $e=2$ in equally spaced snapshots:

![Image of a dog getting noisier at a linear rate.](/modular-diffusion/images/modules/noise-schedule/linear.png)
![Image of a dog gradually turning noisy.](/modular-diffusion/images/modules/noise-type/gaussian.png)
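These snapshots follow from the closed-form forward process $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$ with $\epsilon \sim \mathcal{N}(0, \text{I})$, which can be reproduced standalone (a sketch independent of the library's API):

```python
import torch


def snapshot(x0: torch.Tensor, alpha_bar_t: float) -> torch.Tensor:
    # Jump straight to time step t with the closed-form Gaussian forward process.
    return alpha_bar_t**0.5 * x0 + (1 - alpha_bar_t)**0.5 * torch.randn_like(x0)
```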

## Uniform categorical noise

@@ -60,6 +60,10 @@ where:
- $Q_t = \alpha_t \text{I} + (1 - \alpha_t) \mathbb{1}\mathbb{1}^T / k$
- $\overline{Q}_{t} = \bar{\alpha}_t \text{I} + (1 - \bar{\alpha}_t) \mathbb{1}\mathbb{1}^T / k$

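A quick numeric sanity check of these matrices (illustrative only): each row of $Q_t$ must be a valid probability distribution.

```python
import torch

k, alpha = 4, 0.9
Q = alpha * torch.eye(k) + (1 - alpha) * torch.ones(k, k) / k
print(Q.sum(dim=1))  # tensor([1., 1., 1., 1.]) -- every row sums to one
```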
> One-hot representation
>
> The `Uniform` noise type operates on one-hot vectors. To use it, pair it with the `OneHot` data transform.

### Parameters

- `k` -> Number of categories $k$.
@@ -72,6 +76,12 @@

```python
from diffusion.noise import Uniform

noise = Uniform(k=26)
```

### Visualization

Applying `Uniform` noise to an image with $k=255$ using the `Cosine` schedule with $T=1000$, $s = 8 \times 10^{-3}$ and $e=2$ in equally spaced snapshots:

![Image of a dog gradually turning noisy.](/modular-diffusion/images/modules/noise-type/uniform.png)

## Absorbing categorical noise

Absorbing categorical noise model introduced in [Austin et al. (2021)](https://arxiv.org/abs/2107.03006).
@@ -89,6 +99,10 @@ the absorbing state $m$ and 0 elsewhere.
- $Q_t = \alpha_t \text{I} + (1 - \alpha_t) \mathbb{1}e_m^T$
- $\overline{Q}_{t} = \bar{\alpha}_t \text{I} + (1 - \bar{\alpha}_t) \mathbb{1}e_m^T$

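The same sanity check for the absorbing case (illustrative only): rows sum to one, and the absorbing state $m$ only ever maps to itself.

```python
import torch

k, m, alpha = 4, 2, 0.9
e_m = torch.zeros(k)
e_m[m] = 1
Q = alpha * torch.eye(k) + (1 - alpha) * torch.outer(torch.ones(k), e_m)
print(Q.sum(dim=1))  # tensor([1., 1., 1., 1.])
print(Q[m])          # tensor([0., 0., 1., 0.]) -- state m never leaves
```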
> One-hot representation
>
> The `Absorbing` noise type operates on one-hot vectors. To use it, pair it with the `OneHot` data transform.

### Parameters

- `k` -> Number of categories $k$.
@@ -99,9 +113,15 @@ the absorbing state $m$ and 0 elsewhere.
```python
from diffusion.noise import Absorbing

noise = Absorbing(k=27, m=26)
noise = Absorbing(k=255, m=128)
```

### Visualization

Applying `Absorbing` noise to an image with $k=255$ and $m=128$ using the `Cosine` schedule with $T=1000$, $s = 8 \times 10^{-3}$ and $e=2$ in equally spaced snapshots:

![Image of a dog gradually turning gray.](/modular-diffusion/images/modules/noise-type/absorbing.png)

---

*If you spot any typo or technical imprecision, please submit an issue or pull request to the library's [GitHub repository](https://github.com/cabralpinto/modular-diffusion).*
6 changes: 3 additions & 3 deletions docs/src/pages/modules/probability-distribution.mdx
@@ -40,8 +40,8 @@

```python
from diffusion.distribution import Normal as N

distribution = N(torch.zeros(3), torch.full((3,), 2))
x, epsilon = distribution.sample()
# x = tensor([0.0000, 0.0000, 0.0000])
# epsilon = tensor([0.0000, 0.0000, 0.0000])
# x = tensor([ 1.1053, 1.9027, -0.2554])
# epsilon = tensor([ 0.5527, 0.9514, -0.1277])
```
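Note the relationship between the two returned tensors in the comments above: with a zero mean and a spread of 2, each sampled `x` equals `2 * epsilon`, i.e. the distribution returns both the sample and the base noise that produced it.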

## Categorical distribution
@@ -60,7 +60,7 @@

```python
from diffusion.distribution import Categorical as Cat

distribution = Cat(torch.tensor([[.1, .3, .6], [0, 0, 1]]))
x, _ = distribution.sample()
# x = tensor([1, 2])
# x = tensor([[0., 1., 0.], [0., 0., 1.]])
```

> Noise tensor
