Commit
* autosize, take 1
* fix outputsize on LayerNorm
* tidy & improve
* add tests, release note
* rrule errors, improvements, tests
* documentation
* tweaks
* add jldoctest; output = false
* tweak
* using Flux
Showing 5 changed files with 318 additions and 32 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,47 +1,79 @@ | ||
# Shape Inference | ||
|
||
To help you generate models in an automated fashion, [`Flux.outputsize`](@ref) lets you | ||
calculate the size produced by layers for a given size input. | ||
This is especially useful for layers like [`Conv`](@ref). | ||
Flux has some tools to help generate models in an automated fashion, by inferring the size | ||
of arrays that layers will receive, without doing any computation. | ||
This is especially useful for convolutional models, where the same [`Conv`](@ref) layer | ||
accepts any size of image, but the next layer may not. | ||
|
||
It works by passing a "dummy" array into the model that preserves size information without running any computation. | ||
`outputsize(f, inputsize)` works for all layers (including custom layers) out of the box. | ||
By default, `inputsize` expects the batch dimension, | ||
but you can exclude the batch size with `outputsize(f, inputsize; padbatch=true)` (assuming it to be one). | ||
The higher-level tool is a macro [`@autosize`](@ref) which acts on the code defining the layers, | ||
and replaces each appearance of `_` with the relevant size. This simple example returns a model | ||
with `Dense(845 => 10)` as the last layer: | ||
|
||
Using this utility function lets you automate model building for various inputs like so: | ||
```julia | ||
""" | ||
make_model(width, height, inchannels, nclasses; | ||
layer_config = [16, 16, 32, 32, 64, 64]) | ||
@autosize (28, 28, 1, 32) Chain(Conv((3, 3), _ => 5, relu, stride=2), Flux.flatten, Dense(_ => 10)) | ||
``` | ||
|
||
The input size may be provided at runtime, like `@autosize (sz..., 1, 32) Chain(Conv(`..., but all the | ||
layer constructors containing `_` must be explicitly written out -- the macro sees the code as written. | ||
|
||
This macro relies on a lower-level function [`outputsize`](@ref Flux.outputsize), which you can also use directly: | ||
|
||
```julia | ||
c = Conv((3, 3), 1 => 5, relu, stride=2) | ||
Flux.outputsize(c, (28, 28, 1, 32)) # returns (13, 13, 5, 32) | ||
``` | ||
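
The sizes quoted above (`13` per spatial dimension, and `845` inputs to the final `Dense`) can be checked by hand. The following is a minimal sketch of that arithmetic, not a Flux function: `conv_out` is a hypothetical helper that applies the standard convolution size formula along one dimension, assuming no dilation and symmetric padding.

```julia
# Hypothetical helper, not part of Flux: output length of one convolution
# along a single spatial dimension, assuming dilation 1 and symmetric padding:
#     floor((n + 2*pad - kernel) / stride) + 1
conv_out(n; kernel, stride=1, pad=0) = fld(n + 2pad - kernel, stride) + 1

# Conv((3, 3), 1 => 5, relu, stride=2) applied to a 28×28 image:
side = conv_out(28; kernel=3, stride=2)

# Flux.flatten merges width, height and the 5 output channels of each sample:
features = side * side * 5

@assert side == 13
@assert features == 845
```

This agrees with the `(13, 13, 5, 32)` reported by `Flux.outputsize` and with the `Dense(845 => 10)` that `@autosize` constructs; the batch size of 32 does not enter the calculation.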
|
||
Create a CNN for a given set of configuration parameters. | ||
The function `outputsize` works by passing a "dummy" array into the model, which propagates through very cheaply. | ||
It should work for all layers, including custom layers, out of the box. | ||
|
||
# Arguments | ||
- `width`: the input image width | ||
- `height`: the input image height | ||
- `inchannels`: the number of channels in the input image | ||
- `nclasses`: the number of output classes | ||
- `layer_config`: a vector of the number of filters per each conv layer | ||
An example of how to automate model building is this: | ||
```jldoctest; output = false, setup = :(using Flux) | ||
""" | ||
function make_model(width, height, inchannels, nclasses; | ||
layer_config = [16, 16, 32, 32, 64, 64]) | ||
# construct a vector of conv layers programmatically | ||
conv_layers = [Conv((3, 3), inchannels => layer_config[1])] | ||
for (infilters, outfilters) in zip(layer_config, layer_config[2:end]) | ||
push!(conv_layers, Conv((3, 3), infilters => outfilters)) | ||
make_model(width, height, [inchannels, nclasses; layer_config]) | ||
Create a CNN for a given set of configuration parameters. Arguments: | ||
- `width`, `height`: the input image size in pixels | ||
- `inchannels`: the number of channels in the input image, default `1` | ||
- `nclasses`: the number of output classes, default `10` | ||
- Keyword `layer_config`: a vector of the number of channels per layer, default `[16, 16, 32, 64]` | ||
""" | ||
function make_model(width, height, inchannels = 1, nclasses = 10; | ||
layer_config = [16, 16, 32, 64]) | ||
# construct a vector of layers: | ||
conv_layers = [] | ||
push!(conv_layers, Conv((5, 5), inchannels => layer_config[1], relu, pad=SamePad())) | ||
for (inch, outch) in zip(layer_config, layer_config[2:end]) | ||
push!(conv_layers, Conv((3, 3), inch => outch, sigmoid, stride=2)) | ||
end | ||
# compute the output dimensions for the conv layers | ||
# use padbatch=true to set the batch dimension to 1 | ||
conv_outsize = Flux.outputsize(conv_layers, (width, height, nchannels); padbatch=true) | ||
# compute the output dimensions after these conv layers: | ||
conv_outsize = Flux.outputsize(conv_layers, (width, height, inchannels); padbatch=true) | ||
# the input dimension to Dense is programmatically calculated from | ||
# width, height, and nchannels | ||
return Chain(conv_layers..., Dense(prod(conv_outsize) => nclasses)) | ||
# use this to define appropriate Dense layer: | ||
last_layer = Dense(prod(conv_outsize) => nclasses) | ||
return Chain(conv_layers..., Flux.flatten, last_layer) | ||
end | ||
m = make_model(28, 28, 3, layer_config = [9, 17, 33, 65]) | ||
Flux.outputsize(m, (28, 28, 3, 42)) == (10, 42) == size(m(randn(Float32, 28, 28, 3, 42))) | ||
# output | ||
true | ||
``` | ||
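
To see what `Flux.outputsize` works out inside `make_model(28, 28, 3, layer_config = [9, 17, 33, 65])`, the spatial sizes can be traced by hand. This is a rough sketch under stated assumptions, not Flux code: `conv_out` and `trace_sides` are hypothetical helpers, dilation is taken to be 1, and `SamePad()` on the 5×5 stride-1 layer is modelled as a padding of 2 on each side.

```julia
# Hypothetical helper, not part of Flux: output length along one spatial dimension.
conv_out(n; kernel, stride=1, pad=0) = fld(n + 2pad - kernel, stride) + 1

# Hypothetical helper: side length after each conv layer built by make_model.
function trace_sides(width, nlayers)
    # first layer: 5×5 kernel, stride 1, SamePad() modelled as pad=2
    sides = [conv_out(width; kernel=5, pad=2)]
    # remaining layers: 3×3 kernel, stride 2, no padding
    for _ in 2:nlayers
        push!(sides, conv_out(sides[end]; kernel=3, stride=2))
    end
    return sides
end

layer_config = [9, 17, 33, 65]
sides = trace_sides(28, length(layer_config))

# number of inputs to the final Dense layer: width * height * channels
dense_in = sides[end]^2 * last(layer_config)

@assert sides == [28, 13, 6, 2]
@assert dense_in == 260
```

Under these assumptions the last layer is `Dense(260 => 10)`, consistent with the `(10, 42)` output size checked in the doctest.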
|
||
Alternatively, using the macro, the definition of `make_model` could end with: | ||
|
||
``` | ||
# compute the output dimensions & construct appropriate Dense layer: | ||
return @autosize (width, height, inchannels, 1) Chain(conv_layers..., Flux.flatten, Dense(_ => nclasses)) | ||
end | ||
``` | ||
|
||
### Listing | ||
|
||
```@docs | ||
Flux.@autosize | ||
Flux.outputsize | ||
``` |