diff --git a/docs/src/assets/zygote-crop.png b/docs/src/assets/zygote-crop.png
new file mode 100644
index 0000000000..ddc04b3d17
Binary files /dev/null and b/docs/src/assets/zygote-crop.png differ
diff --git a/docs/src/guide/models/basics.md b/docs/src/guide/models/basics.md
index 1fdb052789..272966ef09 100644
--- a/docs/src/guide/models/basics.md
+++ b/docs/src/guide/models/basics.md
@@ -188,7 +188,7 @@ For ordinary pure functions like `(x,y) -> (x*y)`, this `∂f(x,y)/∂f` would a
 depends on `θ`.
 
 ```@raw html
-Zygote.jl
+Zygote.jl
 ```
 
 Flux's [`gradient`](@ref) function by default calls a companion package called [Zygote](https://github.com/FluxML/Zygote.jl).
 
@@ -327,7 +327,9 @@ grad = Flux.gradient(|>, [1f0], model1)[2]
 
 This gradient is starting to be a complicated nested structure.
 But it works just like before: `grad.outer.inner.W` corresponds to `model1.outer.inner.W`.
 
-###   [Flux's layers](man-layers)
+```@raw html
+  Flux's layers
+```
 
 Rather than define everything from scratch every time, Flux provides a library of
 commonly used layers. The same model could be defined:
 
@@ -359,14 +361,14 @@ How does this `model2` differ from the `model1` we had before?
 
 Calling [`Flux.@layer Layer`](@ref Flux.@layer) will add this, and some other niceties.
 
 If what you need isn't covered by Flux's built-in layers, it's easy to write your own.
-There are more details [later](man-advanced), but the steps are invariably those shown for `struct Layer` above:
+There are more details [later](@ref man-advanced), but the steps are invariably those shown for `struct Layer` above:
 1. Define a `struct` which will hold the parameters.
 2. Make it callable, to define how it uses them to transform the input `x`.
 3. Define a constructor which initialises the parameters (if the default constructor doesn't do what you want).
 4. Annotate with `@layer` to opt-in to pretty printing, and other enhancements.
 
 ```@raw html
-Functors.jl
+Functors.jl
 ```
 
 To deal with such nested structures, Flux relies heavily on an associated package
 
@@ -399,7 +401,7 @@
 of the output -- it must be a number, not a vector. Adjusting the parameters to make this smaller
 won't lead us anywhere interesting. Instead, we should minimise some *loss function* which
 compares the actual output to our desired output.
 
-Perhaps the simplest example is curve fitting. The [previous page](man-overview) fitted
+Perhaps the simplest example is curve fitting. The [previous page](@ref man-overview) fitted
 a linear function to data. With our two-layer `model2`, we can fit a nonlinear function.
 For example, let us use `f(x) = 2x - x^3` evaluated at some points `x in -2:0.1:2` as the data,
 and adjust the parameters of `model2` from above so that its output is similar.
 
@@ -424,6 +426,6 @@ plot(x -> 2x-x^3, -2, 2, label="truth")
 scatter!(x -> model2([x]), -2:0.1f0:2, label="fitted")
 ```
 
-If this general idea is unfamiliar, you may want the [tutorial on linear regression](man-linear-regression).
+If this general idea is unfamiliar, you may want the [tutorial on linear regression](@ref man-linear-regression).
 
-More detail about what exactly the function `train!` is doing, and how to use rules other than simple [`Descent`](@ref Optimisers.Descent), is what the next page in this guide is about: [training](man-training).
+More detail about what exactly the function `train!` is doing, and how to use rules other than simple [`Descent`](@ref Optimisers.Descent), is what the next page in this guide is about: [training](@ref man-training).
diff --git a/docs/src/reference/models/layers.md b/docs/src/reference/models/layers.md
index d7e67d3e3d..ae9232f5fb 100644
--- a/docs/src/reference/models/layers.md
+++ b/docs/src/reference/models/layers.md
@@ -1,4 +1,4 @@
-# [Built-in Layer Types]](@id man-layers)
+# [Built-in Layer Types](@id man-layers)
 
 If you started at the beginning of the guide, then you have already met the
 basic [`Dense`](@ref) layer, and seen [`Chain`](@ref) for combining layers.
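For reference, the four-step recipe mentioned in the basics.md hunk above (define a `struct`, make it callable, add a constructor, annotate with `@layer`) might look like the following sketch. This is not part of the patch: the layer name `Scale` and its fields are invented for illustration, and it assumes a Flux version that provides `Flux.@layer`.

```julia
using Flux

# 1. Define a struct which will hold the parameters.
struct Scale
    w::Vector{Float32}
    b::Vector{Float32}
end

# 2. Make it callable, defining how it transforms the input x.
(m::Scale)(x) = m.w .* x .+ m.b

# 3. A constructor which initialises the parameters.
Scale(n::Integer) = Scale(randn(Float32, n), zeros(Float32, n))

# 4. Opt in to pretty printing, and other enhancements.
Flux.@layer Scale

m = Scale(3)
y = m(ones(Float32, 3))  # elementwise w .* 1 .+ b, so y == m.w .+ m.b
```

Such a layer then composes with `Chain` and `Flux.gradient` like any built-in one.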