Add a quick start example, and change some headings (#2069)
* add quickstart page

* tidy welcome page

* adjust folders, and some headings

* move one page to first section

* tweaks

* say linear regression somewhere, just not in the title

* tweaks

* add emoji for API, re-order

* also mention function names

* activations intro

* move destructure to a new file, along with modules

* tweaks

* tweaks

* sciml link

* less negative spacing

* rm all negative spacing

* better Layer Helpers section

* move Custom Layers to Tutorials section

* fixup

* Apply 3 suggestions

Co-authored-by: Saransh Chopra <[email protected]>

* one more

Co-authored-by: Saransh Chopra <[email protected]>

Co-authored-by: Saransh Chopra <[email protected]>
mcabbott and Saransh-cpp authored Oct 11, 2022
1 parent 1ec32c2 commit b08cb67
Showing 17 changed files with 261 additions and 85 deletions.
42 changes: 24 additions & 18 deletions docs/make.jl
@@ -9,37 +9,43 @@ makedocs(
sitename = "Flux",
# strict = [:cross_references,],
pages = [
"Home" => "index.md",
"Getting Started" => [
"Welcome" => "index.md",
"Quick Start" => "models/quickstart.md",
"Fitting a Line" => "models/overview.md",
"Gradients and Layers" => "models/basics.md",
],
"Building Models" => [
"Overview" => "models/overview.md",
"Basics" => "models/basics.md",
"Built-in Layers 📚" => "models/layers.md",
"Recurrence" => "models/recurrence.md",
"Layer Reference" => "models/layers.md",
"Loss Functions" => "models/losses.md",
"Regularisation" => "models/regularisation.md",
"Custom Layers" => "models/advanced.md",
"NNlib.jl" => "models/nnlib.md",
"Activation Functions" => "models/activation.md",
"Activation Functions 📚" => "models/activation.md",
"NNlib.jl 📚 (`softmax`, `conv`, ...)" => "models/nnlib.md",
],
"Handling Data" => [
"MLUtils.jl" => "data/mlutils.md",
"OneHotArrays.jl" => "data/onehot.md",
"MLUtils.jl 📚 (`DataLoader`, ...)" => "data/mlutils.md",
"OneHotArrays.jl 📚 (`onehot`, ...)" => "data/onehot.md",
],
"Training Models" => [
"Optimisers" => "training/optimisers.md",
"Training" => "training/training.md",
"Callback Helpers" => "training/callbacks.md",
"Zygote.jl" => "training/zygote.md",
"Regularisation" => "models/regularisation.md",
"Loss Functions 📚" => "models/losses.md",
"Optimisation Rules 📚" => "training/optimisers.md", # TODO move optimiser intro up to Training
"Callback Helpers 📚" => "training/callbacks.md",
"Zygote.jl 📚 (`gradient`, ...)" => "training/zygote.md",
],
"GPU Support" => "gpu.md",
"Model Tools" => [
"GPU Support" => "gpu.md",
"Saving & Loading" => "saving.md",
"Shape Inference" => "outputsize.md",
"Weight Initialisation" => "utilities.md",
"Functors.jl" => "models/functors.md",
"Shape Inference 📚" => "outputsize.md",
"Weight Initialisation 📚" => "utilities.md",
"Flat vs. Nested 📚" => "destructure.md",
"Functors.jl 📚 (`fmap`, ...)" => "models/functors.md",
],
"Performance Tips" => "performance.md",
"Flux's Ecosystem" => "ecosystem.md",
"Tutorials" => [ # TODO, maybe
"Custom Layers" => "models/advanced.md", # TODO move freezing to Training
],
],
format = Documenter.HTML(
sidebar_sitename = false,
2 changes: 0 additions & 2 deletions docs/src/assets/flux.css
@@ -100,8 +100,6 @@ article pre {
max-width: none;
padding: 1em;
border-radius: 10px 0px 0px 10px;
margin-left: -1em;
margin-right: -2em;
}

.hljs-comment {
Binary file added docs/src/assets/oneminute.png
10 changes: 4 additions & 6 deletions docs/src/data/mlutils.md
@@ -1,25 +1,23 @@
# Working with data using MLUtils.jl
# Working with Data, using MLUtils.jl

Flux re-exports the `DataLoader` type and utility functions for working with
data from [MLUtils](https://github.com/JuliaML/MLUtils.jl).

## DataLoader
## `DataLoader`

`DataLoader` can be used to handle iteration over mini-batches of data.
The `DataLoader` can be used to create mini-batches of data, in the format [`train!`](@ref Flux.train!) expects.

`Flux`'s website has a [dedicated tutorial](https://fluxml.ai/tutorials/2021/01/21/data-loader.html) on `DataLoader` for more information.
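
For instance, a minimal sketch of the usual pattern (the arrays here are invented for illustration):

```julia
using Flux  # Flux re-exports DataLoader from MLUtils

X = rand(Float32, 10, 100)   # 100 observations, each a column of 10 features
y = rand(Float32, 1, 100)    # one target per observation

loader = Flux.DataLoader((X, y), batchsize=16, shuffle=true)

for (x, y) in loader
    # each x is 10×16 and y is 1×16, except the final, smaller batch
end
```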

```@docs
MLUtils.DataLoader
```

## Utility functions for working with data
## Utility Functions

The utility functions are meant to be used while working with data;
these functions help create inputs for your models or batch your dataset.

Below is a non-exhaustive list of such utility functions.

```@docs
MLUtils.unsqueeze
MLUtils.flatten
69 changes: 69 additions & 0 deletions docs/src/destructure.md
@@ -0,0 +1,69 @@
# [Flat vs. Nested Structures](@id man-destructure)


A Flux model is a nested structure, with parameters stored within many layers. Sometimes you may want a flat representation of them, to interact with functions expecting just one vector. This is provided by `destructure`:

```julia
julia> model = Chain(Dense(2=>1, tanh), Dense(1=>1))
Chain(
Dense(2 => 1, tanh), # 3 parameters
Dense(1 => 1), # 2 parameters
) # Total: 4 arrays, 5 parameters, 276 bytes.

julia> flat, rebuild = Flux.destructure(model)
(Float32[0.863101, 1.2454957, 0.0, -1.6345707, 0.0], Restructure(Chain, ..., 5))

julia> rebuild(zeros(5)) # same structure, new parameters
Chain(
Dense(2 => 1, tanh), # 3 parameters (all zero)
Dense(1 => 1), # 2 parameters (all zero)
) # Total: 4 arrays, 5 parameters, 276 bytes.
```

Both `destructure` and the `Restructure` function can be used within gradient computations. For instance, this computes the Hessian `∂²L/∂θᵢ∂θⱼ` of some loss function, with respect to all parameters of the Flux model. The resulting matrix has off-diagonal entries, which cannot really be expressed in a nested structure:

```julia
julia> x = rand(Float32, 2, 16);

julia> grad = gradient(m -> sum(abs2, m(x)), model) # nested gradient
((layers = ((weight = Float32[10.339018 11.379145], bias = Float32[22.845667], σ = nothing), (weight = Float32[-29.565302;;], bias = Float32[-37.644184], σ = nothing)),),)

julia> function loss(v::Vector)
m = rebuild(v)
y = m(x)
sum(abs2, y)
end;

julia> gradient(loss, flat) # flat gradient, same numbers
(Float32[10.339018, 11.379145, 22.845667, -29.565302, -37.644184],)

julia> Zygote.hessian(loss, flat) # second derivative
5×5 Matrix{Float32}:
-7.13131 -5.54714 -11.1393 -12.6504 -8.13492
-5.54714 -7.11092 -11.0208 -13.9231 -9.36316
-11.1393 -11.0208 -13.7126 -27.9531 -22.741
-12.6504 -13.9231 -27.9531 18.0875 23.03
-8.13492 -9.36316 -22.741 23.03 32.0

julia> Flux.destructure(grad) # acts on non-models, too
(Float32[10.339018, 11.379145, 22.845667, -29.565302, -37.644184], Restructure(Tuple, ..., 5))
```

### All Parameters

The function `destructure` now lives in [`Optimisers.jl`](https://github.com/FluxML/Optimisers.jl).
(Be warned that this package is unrelated to the `Flux.Optimise` sub-module! The confusion is temporary.)

```@docs
Optimisers.destructure
Optimisers.trainable
Optimisers.isnumeric
```

### All Layers

Another kind of flat view of a nested model is provided by the `modules` command. This extracts a list of all layers:

```@docs
Flux.modules
```
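
For instance, a sketch of one way this can be used, an overall penalty on the weight matrix of every `Dense` layer (the model here is arbitrary):

```julia
using Flux, LinearAlgebra

model = Chain(Dense(2 => 3, relu), Dense(3 => 1))

# Flux.modules returns a flat vector containing the Chain itself, its tuple of
# layers, and each individual layer, so we filter for the ones of interest:
penalty = sum(norm(m.weight) for m in Flux.modules(model) if m isa Dense)
```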
25 changes: 15 additions & 10 deletions docs/src/index.md
@@ -1,26 +1,31 @@
# Flux: The Julia Machine Learning Library

Flux is a library for machine learning geared towards high-performance production pipelines. It comes "batteries-included" with many useful tools built in, but also lets you use the full power of the Julia language where you need it. We follow a few key principles:
Flux is a library for machine learning. It comes "batteries-included" with many useful tools built in, but also lets you use the full power of the Julia language where you need it. We follow a few key principles:

* **Doing the obvious thing**. Flux has relatively few explicit APIs for features like regularisation or embeddings. Instead, writing down the mathematical form will work – and be fast.
* **Extensible by default**. Flux is written to be highly extensible and flexible while being performant. Extending Flux is as simple as using your own code as part of the model you want - it is all [high-level Julia code](https://github.com/FluxML/Flux.jl/blob/ec16a2c77dbf6ab8b92b0eecd11661be7a62feef/src/layers/recurrent.jl#L131). When in doubt, it’s well worth looking at [the source](https://github.com/FluxML/Flux.jl/). If you need something different, you can easily roll your own.
* **Performance is key**. Flux integrates with high-performance AD tools such as [Zygote.jl](https://github.com/FluxML/Zygote.jl) for generating fast code. Flux optimizes both CPU and GPU performance. Scaling workloads easily to multiple GPUs can be done with the help of Julia's [GPU tooling](https://github.com/JuliaGPU/CUDA.jl) and projects like [DaggerFlux.jl](https://github.com/DhairyaLGandhi/DaggerFlux.jl).
* **Play nicely with others**. Flux works well with Julia libraries from [data frames](https://github.com/JuliaComputing/JuliaDB.jl) and [images](https://github.com/JuliaImages/Images.jl) to [differential equation solvers](https://github.com/JuliaDiffEq/DifferentialEquations.jl), so you can easily build complex data processing pipelines that integrate Flux models.
* **Extensible by default**. Flux is written to be highly extensible and flexible while being performant. Extending Flux is as simple as using your own code as part of the model you want - it is all [high-level Julia code](https://github.com/FluxML/Flux.jl/blob/ec16a2c77dbf6ab8b92b0eecd11661be7a62feef/src/layers/recurrent.jl#L131). When in doubt, it’s well worth looking at [the source](https://github.com/FluxML/Flux.jl/tree/master/src). If you need something different, you can easily roll your own.
* **Play nicely with others**. Flux works well with Julia libraries from [images](https://github.com/JuliaImages/Images.jl) to [differential equation solvers](https://github.com/SciML/DifferentialEquations.jl), so you can easily build complex data processing pipelines that integrate Flux models.

## Installation

Download [Julia 1.6](https://julialang.org/) or later, if you haven't already. You can add Flux using Julia's package manager, by typing `] add Flux` in the Julia prompt.
Download [Julia 1.6](https://julialang.org/downloads/) or later, preferably the current stable release. You can add Flux using Julia's package manager, by typing `] add Flux` in the Julia prompt.

If you have CUDA you can also run `] add CUDA` to get GPU support; see [here](gpu.md) for more details.
This will automatically install several other packages, including [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) which supports Nvidia GPUs. To directly access some of its functionality, you may want to run `] add CUDA` too. The page on [GPU support](gpu.md) has more details.

NOTE: Flux used to have a CuArrays.jl dependency until v0.10.4, replaced by CUDA.jl in v0.11.0. If you're upgrading Flux from v0.10.4 or a lower version, you may need to remove CuArrays (run `] rm CuArrays`) before you can upgrade.
Other closely associated packages, also installed automatically, include [Zygote](https://github.com/FluxML/Zygote.jl), [Optimisers](https://github.com/FluxML/Optimisers.jl), [NNlib](https://github.com/FluxML/NNlib.jl), [Functors](https://github.com/FluxML/Functors.jl) and [MLUtils](https://github.com/JuliaML/MLUtils.jl).
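
If you prefer to install packages programmatically, the equivalent would be something like:

```julia
using Pkg
Pkg.add("Flux")    # same effect as typing `] add Flux` at the Julia prompt
# Pkg.add("CUDA")  # optional, for direct access to CUDA.jl's functionality
```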

## Learning Flux

There are several different ways to learn Flux. If you just want to get started writing models, the [model zoo](https://github.com/FluxML/model-zoo/) gives good starting points for many common ones. This documentation provides a reference to all of Flux's APIs, as well as a from-scratch introduction to Flux's take on models and how they work. Once you understand these docs, congratulations, you also understand [Flux's source code](https://github.com/FluxML/Flux.jl), which is intended to be concise, legible and a good reference for more advanced concepts.
The [quick start](models/quickstart.md) page trains a simple neural network.

The rest of this documentation provides a from-scratch introduction to Flux's take on models and how they work, starting with [fitting a line](models/overview.md). Once you understand these docs, congratulations, you also understand [Flux's source code](https://github.com/FluxML/Flux.jl), which is intended to be concise, legible and a good reference for more advanced concepts.

Sections with 📚 contain API listings. The same text is available at the Julia prompt, by typing, for example, `?gpu`.

If you just want to get started writing models, the [model zoo](https://github.com/FluxML/model-zoo/) gives good starting points for many common ones.

## Community

All Flux users are welcome to join our community on the [Julia forum](https://discourse.julialang.org/), or the [slack](https://discourse.julialang.org/t/announcing-a-julia-slack/4866) (channel #machine-learning). If you have questions or issues we'll try to help you out.
Everyone is welcome to join our community on the [Julia discourse forum](https://discourse.julialang.org/), or the [slack chat](https://discourse.julialang.org/t/announcing-a-julia-slack/4866) (channel #machine-learning). If you have questions or issues we'll try to help you out.

If you're interested in hacking on Flux, the [source code](https://github.com/FluxML/Flux.jl) is open and easy to understand -- it's all just the same Julia code you work with normally. You might be interested in our [intro issues](https://github.com/FluxML/Flux.jl/labels/good%20first%20issue) to get started or our [contributing guide](https://github.com/FluxML/Flux.jl/blob/master/CONTRIBUTING.md).
If you're interested in hacking on Flux, the [source code](https://github.com/FluxML/Flux.jl) is open and easy to understand -- it's all just the same Julia code you work with normally. You might be interested in our [intro issues](https://github.com/FluxML/Flux.jl/labels/good%20first%20issue) to get started, or our [contributing guide](https://github.com/FluxML/Flux.jl/blob/master/CONTRIBUTING.md).
29 changes: 27 additions & 2 deletions docs/src/models/activation.md
@@ -5,6 +5,10 @@ These non-linearities used between layers of your model are exported by the [NNl

Note that, unless otherwise stated, activation functions operate on scalars. To apply them to an array you can call `σ.(xs)`, `relu.(xs)` and so on. Alternatively, they can be passed to a layer like `Dense(784 => 1024, relu)` which will handle this broadcasting.

Functions like [`softmax`](@ref) are sometimes described as activation functions, but not by Flux. They must see all the outputs at once, and hence cannot be broadcast. See the next page for details.
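
For example, a small sketch of the difference, with an arbitrary vector:

```julia
using Flux  # activation functions are exported via NNlib

x = [0.0, 2.0, -3.0]

relu.(x)     # broadcast elementwise: [0.0, 2.0, 0.0]
softmax(x)   # acts on the whole vector at once; its entries sum to 1
```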

### Alphabetical Listing

```@docs
celu
elu
@@ -32,8 +36,29 @@ tanhshrink
trelu
```

Julia's `Base.Math` also provide `tanh`, which can be used as an activation function:
### One More

Julia's `Base.Math` also provides `tanh`, which can be used as an activation function.

Note that many Flux layers will automatically replace this with [`NNlib.tanh_fast`](@ref) when called, as Base's `tanh` is slow enough to sometimes be a bottleneck.

```julia
julia> using UnicodePlots

julia> lineplot(tanh, -3, 3, height=7)
┌────────────────────────────────────────┐
1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⣀⠤⠔⠒⠒⠉⠉⠉⠉⠉⠉⠉⠉⠉│ tanh(x)
│⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⡠⠖⠋⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
│⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⡰⠊⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
f(x) │⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⡤⡯⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤│
│⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠎⠁⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
│⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠴⠊⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
-1 │⣀⣀⣀⣀⣀⣀⣀⣀⣀⡤⠤⠔⠒⠉⠁⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
└────────────────────────────────────────┘
-3⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀3⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀x⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
```

```@docs
tanh
```
25 changes: 19 additions & 6 deletions docs/src/models/basics.md
@@ -1,4 +1,4 @@
# Flux Basics
# [How Flux Works: Gradients and Layers](@id man-basics)

## Taking Gradients

@@ -211,14 +211,27 @@ m = Chain(x -> x^2, x -> x+1)
m(5) # => 26
```

## Layer helpers
## Layer Helpers

Flux provides a set of helpers for custom layers, which you can enable by calling
There is still one problem with this `Affine` layer: Flux does not know to look inside it. This means that [`Flux.train!`](@ref) won't see its parameters, nor will [`gpu`](@ref) be able to move them to your GPU. These features are enabled by the `@functor` macro:

```julia
```
Flux.@functor Affine
```

This enables a useful extra set of functionality for our `Affine` layer, such as [collecting its parameters](../training/optimisers.md) or [moving it to the GPU](../gpu.md).
Finally, most Flux layers make bias optional, and allow you to supply the function used for generating random weights. We can easily add these refinements to the `Affine` layer as follows:

```
function Affine((in, out)::Pair; bias=true, init=Flux.randn32)
W = init(out, in)
b = Flux.create_bias(W, bias, out)
Affine(W, b)
end
Affine(3 => 1, bias=false, init=ones) |> gpu
```
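
With these definitions in place, a quick sketch of what they enable (assuming the `Affine` struct defined earlier on this page):

```julia
a = Affine(3 => 1)     # weight from Flux.randn32, bias enabled by default
ps = Flux.params(a)    # thanks to @functor, this collects both arrays
length(ps)             # 2: the weight matrix and the bias vector
```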

For some more helpful tricks, including parameter freezing, please checkout the [advanced usage guide](advanced.md).
```@docs
Functors.@functor
Flux.create_bias
```
2 changes: 1 addition & 1 deletion docs/src/models/functors.md
@@ -4,7 +4,7 @@ Flux models are deeply nested structures, and [Functors.jl](https://github.com/F

New layers should be annotated using the `Functors.@functor` macro. This will enable [`params`](@ref Flux.params) to see the parameters inside, and [`gpu`](@ref) to move them to the GPU.

`Functors.jl` has its own [notes on basic usage](https://fluxml.ai/Functors.jl/stable/#Basic-Usage-and-Implementation) for more details. Additionally, the [Advanced Model Building and Customisation](@ref Advanced-Model-Building-and-Customisation) page covers the use cases of `Functors` in greater details.
`Functors.jl` has its own [notes on basic usage](https://fluxml.ai/Functors.jl/stable/#Basic-Usage-and-Implementation) for more details. Additionally, the [Advanced Model Building and Customisation](../models/advanced.md) page covers the use cases of `Functors` in greater detail.
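
As a minimal sketch (the `Scale` struct here is invented purely for illustration):

```julia
using Flux

struct Scale            # a custom layer with one trainable field
    s::Vector{Float32}
end
Flux.@functor Scale

(m::Scale)(x) = m.s .* x

layer = Scale([1f0, 2f0])
Flux.params(layer)      # now sees `s`, and `gpu(layer)` can likewise move it
```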

```@docs
Functors.@functor
9 changes: 0 additions & 9 deletions docs/src/models/layers.md
@@ -86,12 +86,3 @@ Many normalisation layers behave differently under training and inference (testi
Flux.testmode!
trainmode!
```
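
For example, a short sketch of switching a model by hand (the layers are arbitrary):

```julia
m = Chain(Dense(2 => 2), BatchNorm(2))

Flux.testmode!(m)   # BatchNorm now uses its stored statistics
trainmode!(m)       # back to training behaviour
```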


## Listing All Layers

The `modules` command uses Functors to extract a flat list of all layers:

```@docs
Flux.modules
```
2 changes: 1 addition & 1 deletion docs/src/models/losses.md
@@ -1,4 +1,4 @@
# Loss Functions
# [Loss Functions](@id man-losses)

Flux provides a large number of common loss functions used for training machine learning models.
They are grouped together in the `Flux.Losses` module.
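
For example, a minimal sketch with hand-written values:

```julia
using Flux

ŷ = [0.9, 0.1, 0.0]   # predictions
y = [1.0, 0.0, 0.0]   # targets

Flux.Losses.mse(ŷ, y)   # mean squared error
Flux.Losses.mae(ŷ, y)   # mean absolute error
```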
