There are a few fairly easy ways for newcomers to substantially improve ForwardDiff, and they all revolve around writing functions for Dual numbers. This section provides brief tutorials on how to make these contributions.
If you're new to GitHub, here's an outline of the workflow you should use:
Fork ForwardDiff
Make a new branch on your fork, named after whatever changes you'll be making
Apply your code changes to the branch on your fork
When you're done, submit a PR to the ForwardDiff repo to merge your branch into ForwardDiff's master branch.
In general, new derivative implementations for Dual are automatically defined via simple symbolic rules. ForwardDiff accomplishes this by looping over the rules provided by the DiffRules package and using them to auto-generate Dual definitions. Conveniently, these auto-generated definitions are also automatically tested.
Thus, in order to add a new derivative implementation for Dual, you should define the appropriate derivative rule(s) in DiffRules, and then check that calling the function on Dual instances delivers the desired result.
Depending on your function, ForwardDiff's auto-definition mechanism might need to be expanded to support it. If this is the case, file an issue/PR so that ForwardDiff's maintainers can help you out.
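For instance, a minimal sketch of what registering a new rule might look like (MyPkg and mycot are hypothetical placeholders, not part of DiffRules):
julia> using DiffRules

julia> DiffRules.@define_diffrule MyPkg.mycot(x) = :(-abs2(csc($x)))  # d/dx cot(x) = -csc²(x)

julia> DiffRules.diffrule(:MyPkg, :mycot, :t)  # query the registered rule back
:(-abs2(csc(t)))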
ForwardDiff is an implementation of forward mode automatic differentiation (AD) in Julia. There are two key components of this implementation: the Dual type, and the API.
ForwardDiff implements methods to take derivatives, gradients, Jacobians, Hessians, and higher-order derivatives of native Julia functions (or any callable object, really) using forward mode automatic differentiation (AD).
While performance can vary depending on the functions you evaluate, the algorithms implemented by ForwardDiff generally outperform non-AD algorithms in both speed and accuracy.
ForwardDiff is a registered Julia package, so it can be installed by running:
julia> import Pkg; Pkg.add("ForwardDiff")
Here's a simple example showing the package in action:
julia> using ForwardDiff
julia> f(x::Vector) = sin(x[1]) + prod(x[2:end]); # returns a scalar
julia> x = [pi/4, 2.0, 3.0, 4.0];  # inputs chosen to reproduce the output below

julia> ForwardDiff.jacobian(x) do x  # Jacobian of a vector-valued function
           [sin(x[1]), prod(x[2:end])]
       end
2×4 Matrix{Float64}:
 0.707107   0.0  0.0  0.0
 0.0       12.0  8.0  6.0
If you find ForwardDiff useful in your work, we kindly request that you cite our paper. The relevant BibLaTeX is available in ForwardDiff's README (not included here because BibLaTeX doesn't play nice with Documenter/Jekyll).
This document describes several techniques and features that can be used in conjunction with ForwardDiff's basic API in order to fine-tune calculations and increase performance.
Let's say you want to calculate the value, gradient, and Hessian of some function f at an input x. You could execute f(x), ForwardDiff.gradient(f, x) and ForwardDiff.hessian(f, x), but that would be a horribly redundant way to accomplish this task!
In the course of calculating higher-order derivatives, ForwardDiff ends up calculating all the lower-order derivatives and primal value f(x). To retrieve these results in one fell swoop, you can utilize the DiffResults API.
All mutating ForwardDiff API methods support the DiffResults API. In other words, API methods of the form ForwardDiff.method!(out, args...) will work appropriately if isa(out, DiffResults.DiffResult).
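For example, a minimal sketch using the DiffResults API (the target function and input are illustrative):
julia> using ForwardDiff, DiffResults

julia> f(x) = sum(abs2, x);

julia> x = [1.0, 2.0];

julia> result = DiffResults.HessianResult(x);  # holds value, gradient, and Hessian

julia> result = ForwardDiff.hessian!(result, f, x);  # one call fills all three

julia> DiffResults.value(result)
5.0

julia> DiffResults.gradient(result)
2-element Vector{Float64}:
 2.0
 4.0

julia> DiffResults.hessian(result)
2×2 Matrix{Float64}:
 2.0  0.0
 0.0  2.0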
ForwardDiff performs partial derivative evaluation on one "chunk" of the input vector at a time. Each differentiation of a chunk requires a call to the target function as well as additional memory proportional to the square of the chunk's size. Thus, a smaller chunk size makes better use of memory bandwidth at the cost of more calls to the target function, while a larger chunk size reduces calls to the target function at the cost of more memory bandwidth.
For example:
julia> using ForwardDiff: GradientConfig, Chunk, gradient!
# let's use a Rosenbrock function as our target function
julia> rosenbrock(x) = sum(1:length(x)-1) do i  # classic Rosenbrock function (a = 1, b = 100)
           (1 - x[i])^2 + 100 * (x[i+1] - x[i]^2)^2
       end;
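Continuing this sketch, chunk sizes can be compared empirically (the input length, chunk sizes, and use of BenchmarkTools below are illustrative assumptions):
julia> using BenchmarkTools

julia> x = rand(10000); out = similar(x);

julia> cfg1 = GradientConfig(rosenbrock, x, Chunk{1}());

julia> cfg10 = GradientConfig(rosenbrock, x, Chunk{10}());

julia> @btime gradient!($out, $rosenbrock, $x, $cfg1);   # more target-function calls, smaller buffers

julia> @btime gradient!($out, $rosenbrock, $x, $cfg10);  # fewer target-function calls, larger buffers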
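Hessians of vector-valued functions can be obtained by nesting jacobian calls; a minimal sketch of such a vector_hessian helper (the name and reshape convention are assumptions):
julia> function vector_hessian(f, x)
           n = length(x)
           # the Jacobian of the Jacobian stacks the Hessian of each output of f
           out = ForwardDiff.jacobian(x -> ForwardDiff.jacobian(f, x), x)
           return reshape(out, n, n, n)  # assumes length(f(x)) == length(x)
       end;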
Likewise, you could write a version of vector_hessian which supports functions of the form f!(y, x), or perhaps an in-place Jacobian with ForwardDiff.jacobian!.
The Dual type includes a "tag" parameter indicating the particular function call to which it belongs. This is to avoid a problem known as perturbation confusion which can arise when there are nested differentiation calls. Tags are automatically generated as part of the appropriate config object, and the tag is checked when the config is used as part of a differentiation call (derivative, gradient, etc.): an InvalidTagException will be thrown if the incorrect config object is used.
This checking can sometimes be inconvenient, and there are certain cases where you may want to disable this checking.
Warning
Disabling tag checking should only be done with caution, especially if the code itself could be used inside another differentiation call.
(preferred) Provide an extra Val{false}() argument to the differentiation function, e.g.
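(cfg, f, g, and x below are illustrative placeholders)
julia> cfg = ForwardDiff.GradientConfig(g, x);  # config tagged for g

julia> ForwardDiff.gradient(f, x, cfg, Val{false}())  # tag check skipped; no InvalidTagException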
Return J(f) evaluated at x, assuming f is called as f(x). Multidimensional arrays are flattened in iteration order: the array J(f) has shape length(f(x)) × length(x), and its elements are J(f)[j,k] = ∂f(x)[j]/∂x[k]. When x is a vector, this means that jacobian(x->[f(x)], x) is the transpose of gradient(f, x).
This method assumes that isa(f(x), AbstractArray).
Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.
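A small illustration of the flattening convention (the function and values are chosen arbitrarily):
julia> using ForwardDiff

julia> x = [1.0, 2.0, 3.0];

julia> ForwardDiff.gradient(x -> sum(abs2, x), x)
3-element Vector{Float64}:
 2.0
 4.0
 6.0

julia> ForwardDiff.jacobian(x -> [sum(abs2, x)], x)  # 1×3: the transpose of the gradient
1×3 Matrix{Float64}:
 2.0  4.0  6.0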
Exactly like ForwardDiff.hessian!(result::AbstractArray, f, x::AbstractArray, cfg::HessianConfig), but because isa(result, DiffResult), cfg is constructed as HessianConfig(f, result, x) instead of HessianConfig(f, x).
Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.
For the sake of convenience and performance, all "extra" information used by ForwardDiff's API methods is bundled up in the ForwardDiff.AbstractConfig family of types. These types allow the user to easily feed several different parameters to ForwardDiff's API methods, such as chunk size, work buffers, and perturbation seed configurations.
ForwardDiff's basic API methods will allocate these types automatically by default, but you can drastically reduce memory usage if you preallocate them yourself.
Note that for all constructors below, the chunk size N may be explicitly provided, or omitted, in which case ForwardDiff will automatically select a chunk size for you. However, it is highly recommended to specify the chunk size manually when possible (see Configuring Chunk Size).
Note also that configurations constructed for a specific function f cannot be reused to differentiate other functions (though can be reused to differentiate f at different values). To construct a configuration which can be reused to differentiate any function, you can pass nothing as the function argument. While this is more flexible, it decreases ForwardDiff's ability to catch and prevent perturbation confusion.
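A minimal sketch of the preallocate-and-reuse pattern (the target function, input size, and chunk size are illustrative):
julia> f(x) = sum(sin, x);

julia> x = rand(100); out = similar(x);

julia> cfg = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk{10}());  # work buffers allocated once

julia> ForwardDiff.gradient!(out, f, x, cfg);  # cfg's buffers are reused on every call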
Return a DerivativeConfig instance based on the type of f!, and the types/shapes of the output vector y and the input vector x.
The returned DerivativeConfig instance contains all the work buffers required by ForwardDiff.derivative and ForwardDiff.derivative! when the target function takes the form f!(y, x).
If f! is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83).
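For example (f! and the evaluation point are illustrative):
julia> f!(y, t) = (y[1] = sin(t); y[2] = cos(t); nothing);

julia> y = zeros(2); out = similar(y);

julia> cfg = ForwardDiff.DerivativeConfig(f!, y, 0.5);

julia> ForwardDiff.derivative!(out, f!, y, 0.5, cfg)
2-element Vector{Float64}:
  0.8775825618903728
 -0.479425538604203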
Return a GradientConfig instance based on the type of f and type/shape of the input vector x.
The returned GradientConfig instance contains all the work buffers required by ForwardDiff.gradient and ForwardDiff.gradient!.
If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83).
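For instance, a config built with nothing can be shared across target functions (a sketch):
julia> x = rand(10);

julia> cfg = ForwardDiff.GradientConfig(nothing, x);  # not tied to any particular function

julia> ForwardDiff.gradient(sum, x, cfg);

julia> ForwardDiff.gradient(x -> sum(abs2, x), x, cfg);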
Return a JacobianConfig instance based on the type of f and type/shape of the input vector x.
The returned JacobianConfig instance contains all the work buffers required by ForwardDiff.jacobian and ForwardDiff.jacobian! when the target function takes the form f(x).
If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83).
Return a JacobianConfig instance based on the type of f!, and the types/shapes of the output vector y and the input vector x.
The returned JacobianConfig instance contains all the work buffers required by ForwardDiff.jacobian and ForwardDiff.jacobian! when the target function takes the form f!(y, x).
If f! is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83).
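A sketch of the f!(y, x) form (the function and inputs are illustrative):
julia> f!(y, x) = (y[1] = x[1] * x[2]; y[2] = x[1] + x[2]; nothing);

julia> x = [2.0, 3.0]; y = zeros(2); J = zeros(2, 2);

julia> cfg = ForwardDiff.JacobianConfig(f!, y, x);

julia> ForwardDiff.jacobian!(J, f!, y, x, cfg)
2×2 Matrix{Float64}:
 3.0  2.0
 1.0  1.0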
Return a HessianConfig instance based on the type of f and type/shape of the input vector x.
The returned HessianConfig instance contains all the work buffers required by ForwardDiff.hessian and ForwardDiff.hessian!. For the latter, the buffers are configured for the case where the result argument is an AbstractArray. If it is a DiffResult, the HessianConfig should instead be constructed via ForwardDiff.HessianConfig(f, result, x, chunk).
If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83).
Return a HessianConfig instance based on the type of f, types/storage in result, and type/shape of the input vector x.
The returned HessianConfig instance contains all the work buffers required by ForwardDiff.hessian! for the case where the result argument is a DiffResult.
If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83).
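A sketch tying this constructor to the DiffResults API (the target function and input are illustrative):
julia> using DiffResults

julia> f(x) = sum(abs2, x);

julia> x = rand(3); result = DiffResults.HessianResult(x);

julia> cfg = ForwardDiff.HessianConfig(f, result, x);  # note: built from result, not just x

julia> result = ForwardDiff.hessian!(result, f, x, cfg);

julia> DiffResults.hessian(result)  # along with DiffResults.value and DiffResults.gradient
3×3 Matrix{Float64}:
 2.0  0.0  0.0
 0.0  2.0  0.0
 0.0  0.0  2.0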
ForwardDiff works by injecting user code with new number types that collect derivative information at runtime. Naturally, this technique has some limitations. Here's a list of all the roadblocks we've seen users run into ("target function" here refers to the function being differentiated):
The target function can only be composed of generic Julia functions. ForwardDiff cannot propagate derivative information through non-Julia code. Thus, your function may not work if it makes calls to external, non-Julia programs, e.g. uses explicit BLAS calls instead of Ax_mul_Bx-style functions.
The target function must be unary (i.e., only accept a single argument). ForwardDiff.jacobian is an exception to this rule.
The target function must be written generically enough to accept numbers of type T<:Real as input (or arrays of these numbers). The function doesn't require a specific type signature, as long as the type signature is generic enough to avoid breaking this rule. This also means that any storage allocated within the function must be generic as well (see this comment for an example, and the sketch after this list).
The types of array inputs must be subtypes of AbstractArray. Non-AbstractArray array-like types are not officially supported.
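A sketch of the generic-storage point from the list above (the function names are illustrative):
julia> too_strict(x) = (y = zeros(Float64, length(x)); y .= 2 .* x; sum(y));  # errors under AD: y can't hold Dual numbers

julia> generic_ok(x) = (y = zeros(eltype(x), length(x)); y .= 2 .* x; sum(y));  # works: element type follows the input

julia> ForwardDiff.gradient(generic_ok, rand(3))
3-element Vector{Float64}:
 2.0
 2.0
 2.0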
Each new minor release of ForwardDiff may introduce changes in the API or behavior. Here, we'll provide some examples that highlight the key differences to help you upgrade from older versions of ForwardDiff.
In order to avoid namespace conflicts with other packages, ForwardDiff's Differentiation API is no longer exported by default. Thus, you must now fully qualify the functions to reference them:
# ForwardDiff v0.1
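julia> using ForwardDiff

julia> hessian(f, x)  # worked in v0.1, which exported its API (hessian is an illustrative stand-in, as are f and x)

# ForwardDiff v0.2 and later
julia> using ForwardDiff

julia> ForwardDiff.hessian(f, x)  # fully qualify the call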