Commit: updating docs

andrewning committed Jun 28, 2023
1 parent d970a7e commit 808dc3c
Showing 3 changed files with 157 additions and 63 deletions.
44 changes: 18 additions & 26 deletions README.md
@@ -4,44 +4,32 @@
[![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://byuflowlab.github.io/ImplicitAD.jl/dev/)
[![Build Status](https://github.com/byuflowlab/ImplicitAD.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/byuflowlab/ImplicitAD.jl/actions/workflows/CI.yml?query=branch%3Amain)

**Summary**: Automate adjoints: make implicit functions compatible with algorithmic differentiation without differentiating inside the solvers. Also allow for custom rules with explicit functions (e.g., calling external code, mixed mode AD).
**Summary**: Automate steady and unsteady adjoints.

Make implicit functions compatible with algorithmic differentiation (AD) without differentiating inside the solvers (discrete adjoint). Even though one can sometimes propagate AD through a solver, this is typically inefficient and less accurate. Instead, one should use adjoints or direct (forward) methods. However, implementing adjoints is often cumbersome. This package allows for a one-line change to automate this process. End-users can then use your package with AD normally, and utilize adjoints automatically.

We've also enabled methods to efficiently compute derivatives through explicit and implicit ODE solvers (unsteady discrete adjoint). For the implicit solve at each time step we can apply the same methodology. However, both explicit and implicit solvers still face memory challenges for long time-based simulations. We analytically propagate derivatives between time steps so that reverse mode AD tapes only need to extend across a single time step. This allows for arbitrarily long time sequences without increasing memory requirements.

As a side benefit, the above functionality easily allows one to define custom AD rules. This is perhaps most useful when calling code from another language. We provide fallbacks for utilizing finite differencing and complex step efficiently if the external code cannot provide derivatives (ideally via Jacobian-vector products). This functionality can also be used for mixed-mode AD.

**Authors**: Andrew Ning and Taylor McDonnell

**Features**:

- Compatible with ForwardDiff and ReverseDiff
- Compatible with ForwardDiff and ReverseDiff (or any ChainRules compliant reverse mode AD package)
- Compatible with any solver (no differentiation occurs inside the solver)
- Simple drop-in functionality
- Customizable subfunctions to accommodate different use cases like iterative linear solvers, custom Jacobian vector products, etc.
- Customizable subfunctions to accommodate different use cases (e.g., custom linear solvers, factorizations, matrix-free operators)
- Version for ordinary differential equations (i.e., discrete unsteady adjoint)
- Analytic overrides for linear systems
- Analytic overrides for eigenvalue problems
- Analytic overrides for linear systems (more efficient)
- Analytic overrides for eigenvalue problems (more efficient)
- Can provide custom rules to be inserted into the AD chain. Provides finite differencing and complex step defaults for cases where AD is not available (e.g., calling another language).

**Implicit Motivation**:

Many engineering analyses use implicit functions. We can represent any such implicit function generally as:
```math
r(y; x) = 0
```
where ``r`` are the residual functions we wish to drive to zero, ``x`` are inputs, and ``y`` are the state variables, which are also outputs once the system of equations is solved. In other words, ``y`` is an implicit function of ``x`` (``x -> r(y; x) -> y``).

We then choose some appropriate solver to converge these residuals. From a differentiation perspective, we would like to compute ``dy/dx``. One can often use algorithmic differentiation (AD) in the same way one would for any explicit function: once we unroll the iterations of the solver, the set of instructions is explicit. However, this is at best inefficient and at worst inaccurate or not possible (at least not without a lot more effort). To obtain accurate derivatives by propagating AD through a solver, the solver must be converged to a tight tolerance, generally tighter than is required to converge the primal values. Sometimes this is not feasible because operations inside the solvers may not be overloaded for AD; this is especially true when calling solvers in other languages. But even if we can do it (tight convergence is possible and everything under the hood is overloaded) we usually still shouldn't, as it would be computationally inefficient. Instead we can use implicit differentiation to allow AD to work seamlessly with implicit functions without having to differentiate through them.

This package provides an implementation so that a simple one-line change can be applied to allow AD to be propagated around any solver. Note that the implementation of the solver need not be AD compatible since AD does not occur inside the solver. This package is overloaded for [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) and [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl). There are also optional inputs so that subfunction behavior can be customized (e.g., preallocation, custom linear solvers, custom factorizations).
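
To make the one-line change concrete, here is a minimal, self-contained sketch of the idea in plain Julia with ForwardDiff (this is not the package's API): converge the residuals with any solver you like, then recover ``dy/dx`` from the converged state via the implicit-function theorem. The toy residual, starting guess, and solver below are hypothetical.

```julia
using ForwardDiff, LinearAlgebra

# toy residual r(y; x) with two states and two inputs (hypothetical)
r(y, x) = [y[1] + x[1]*y[2]^2 - 1.0,
           y[2] - cos(x[2]*y[1])]

# any solver works here; the Jacobian below is only used for the primal Newton
# steps, not for the sensitivities
function solve(x; tol=1e-10)
    y = [0.5, 0.5]
    for _ in 1:50
        J = ForwardDiff.jacobian(yy -> r(yy, x), y)
        dy = J \ r(y, x)
        y -= dy
        norm(dy) < tol && break
    end
    return y
end

x = [1.3, 0.7]
y = solve(x)

# implicit-function theorem: (dr/dy) (dy/dx) = -(dr/dx), evaluated at the converged y
drdy = ForwardDiff.jacobian(yy -> r(yy, x), y)
drdx = ForwardDiff.jacobian(xx -> r(y, xx), x)
dydx = -(drdy \ drdx)
```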

**Custom Rule Motivation**:

A different but related need is to propagate AD through functions that are not AD compatible. A common example would be a call to a subfunction in another language that is part of a larger AD-compatible function. This package provides a simple wrapper to estimate the derivatives of the subfunction with finite differencing (forward or central) or complex step. Those derivatives are then inserted into the AD chain so that the overall function seamlessly works with ForwardDiff or ReverseDiff.

That same functionality is also useful in cases where a function is already AD compatible but a more efficient rule is available. We can provide the Jacobian or the Jacobian-vector / vector-Jacobian products directly. One common example is mixed-mode AD. In this case we may have a subfunction that is most efficiently differentiated in reverse mode, but the overall function is differentiated in forward mode. We can provide a custom rule for the subfunction which will then be inserted into the forward mode chain.
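
To illustrate the kind of rule that gets inserted, here is a conceptual sketch (not this package's API) of a forward finite-difference Jacobian-vector product for a black-box subfunction; `blackbox` and the step size are hypothetical stand-ins for external code. Only a forward-mode rule needs the JVP; a reverse-mode rule would instead need the vector-Jacobian product.

```julia
# stand-in for a subfunction whose internals AD cannot see (e.g., another language)
blackbox(x) = [x[1]^2 + sin(x[2]), exp(x[1]) * x[2]]

# forward finite-difference directional derivative: J(x)*v ≈ (f(x + h*v) - f(x)) / h
function fd_jvp(f, x, v; h=1e-6)
    return (f(x .+ h .* v) .- f(x)) ./ h
end

x = [0.3, 1.1]
v = [1.0, 0.0]                    # seed direction supplied by the forward-mode chain
println(fd_jvp(blackbox, x, v))   # ≈ first column of the Jacobian
```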

**Documentation**:

- Start with the tutorial to learn usage.
- The API is described in the reference page.
- The math is particularly helpful for those wanting to provide their own custom subfunctions. See the theory page.
- The math is particularly helpful for those wanting to provide their own custom subfunctions. See the theory and also some scaling examples in this [PDF](https://arxiv.org/pdf/2306.15243.pdf). A supplementary document deriving the linear and eigenvalue cases is available in this [PDF]().

**Run Unit Tests**:

@@ -50,6 +38,10 @@ pkg> activate .
pkg> test
```

**Citing**:

For now, please cite the following preprint. DOI: [10.48550/arXiv.2306.15243](https://doi.org/10.48550/arXiv.2306.15243)

**Other Packages**:

[Nonconvex.jl](https://julianonconvex.github.io/Nonconvex.jl/stable/gradients/implicit/) and [ImplicitDifferentiation.jl](https://github.com/gdalle/ImplicitDifferentiation.jl) are other prior implementations of the nonlinear case. [SciML](https://docs.sciml.ai/SciMLSensitivity/stable/manual/differential_equation_sensitivities/#sensitivity_diffeq) provides support for continuous unsteady adjoints of ODEs. They have also recently added an implementation for the [nonlinear case](https://docs.sciml.ai/SciMLSensitivity/stable/manual/nonlinear_solve_sensitivities/).
[Nonconvex.jl](https://julianonconvex.github.io/Nonconvex.jl/stable/gradients/implicit/) and [ImplicitDifferentiation.jl](https://github.com/gdalle/ImplicitDifferentiation.jl) are other prior implementations of the nonlinear portion of this package. [SciML](https://docs.sciml.ai/SciMLSensitivity/stable/manual/differential_equation_sensitivities/#sensitivity_diffeq) provides support for continuous unsteady adjoints of ODEs. They have also recently added an implementation for the [nonlinear case](https://docs.sciml.ai/SciMLSensitivity/stable/manual/nonlinear_solve_sensitivities/).
40 changes: 16 additions & 24 deletions docs/src/index.md
@@ -1,43 +1,31 @@
# ImplicitAD Documentation

**Summary**: Make implicit functions compatible with algorithmic differentiation without differentiating inside the solvers. Also allow for custom rules with explicit functions (e.g., calling external code, mixed mode AD).
**Summary**: Automate steady and unsteady adjoints.

Make implicit functions compatible with algorithmic differentiation (AD) without differentiating inside the solvers (discrete adjoint). Even though one can sometimes propagate AD through a solver, this is typically inefficient and less accurate. Instead, one should use adjoints or direct (forward) methods. However, implementing adjoints is often cumbersome. This package allows for a one-line change to automate this process. End-users can then use your package with AD normally, and utilize adjoints automatically.

We've also enabled methods to efficiently compute derivatives through explicit and implicit ODE solvers (unsteady discrete adjoint). For the implicit solve at each time step we can apply the same methodology. However, both explicit and implicit solvers still face memory challenges for long time-based simulations. We analytically propagate derivatives between time steps so that reverse mode AD tapes only need to extend across a single time step. This allows for arbitrarily long time sequences without increasing memory requirements.

As a side benefit, the above functionality easily allows one to define custom AD rules. This is perhaps most useful when calling code from another language. We provide fallbacks for utilizing finite differencing and complex step efficiently if the external code cannot provide derivatives (ideally via Jacobian-vector products). This functionality can also be used for mixed-mode AD.

**Authors**: Andrew Ning and Taylor McDonnell

**Features**:

- Compatible with ForwardDiff and ReverseDiff
- Compatible with ForwardDiff and ReverseDiff (or any ChainRules compliant reverse mode AD package)
- Compatible with any solver (no differentiation occurs inside the solver)
- Simple drop-in functionality
- Customizable subfunctions to accommodate different use cases
- Version for ordinary differential equations (i.e., discrete adjoint)
- Customizable subfunctions to accommodate different use cases (e.g., custom linear solvers, factorizations, matrix-free operators)
- Version for ordinary differential equations (i.e., discrete unsteady adjoint)
- Analytic overrides for linear systems (more efficient)
- Analytic overrides for eigenvalue problems (more efficient)
- Can provide custom rules to be inserted into the AD chain. Provides finite differencing and complex step defaults for cases where AD is not available (e.g., calling another language).

**Implicit Motivation**:

Many engineering analyses use implicit functions. We can represent any such implicit function generally as:
```math
r(y; x) = 0
```
where ``r`` are the residual functions we wish to drive to zero, ``x`` are inputs, and ``y`` are the state variables, which are also outputs once the system of equations is solved. In other words, ``y`` is an implicit function of ``x`` (``x -> r(y; x) -> y``).

We then choose some appropriate solver to converge these residuals. From a differentiation perspective, we would like to compute ``dy/dx``. One can often use algorithmic differentiation (AD) in the same way one would for any explicit function: once we unroll the iterations of the solver, the set of instructions is explicit. However, this is at best inefficient and at worst inaccurate or not possible (at least not without a lot more effort). To obtain accurate derivatives by propagating AD through a solver, the solver must be converged to a tight tolerance, generally tighter than is required to converge the primal values. Sometimes this is not feasible because operations inside the solvers may not be overloaded for AD; this is especially true when calling solvers in other languages. But even if we can do it (tight convergence is possible and everything under the hood is overloaded) we usually still shouldn't, as it would be computationally inefficient. Instead we can use implicit differentiation to allow AD to work seamlessly with implicit functions without having to differentiate through them.

This package provides an implementation so that a simple one-line change can be applied to allow AD to be propagated around any solver. Note that the implementation of the solver need not be AD compatible since AD does not occur inside the solver. This package is overloaded for [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) and [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl). There are also optional inputs so that subfunction behavior can be customized (e.g., preallocation, custom linear solvers, custom factorizations).
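
For a sense of the adjoint form that this one-line change automates when there are many inputs and few outputs, here is a self-contained sketch in plain Julia (again, not this package's API); the residual, scalar output, and solver below are hypothetical.

```julia
using ForwardDiff, LinearAlgebra

r(y, x) = [y[1] - x[1]*cos(y[2]),           # toy residual (hypothetical)
           y[2] - x[2] + 0.1*y[1]^2]
J(y, x) = y[1]^2 + x[2]*y[2]                # scalar output of interest (hypothetical)

function solve(x; tol=1e-12)                # any converged solver will do
    y = zeros(2)
    for _ in 1:50
        A = ForwardDiff.jacobian(yy -> r(yy, x), y)
        dy = A \ r(y, x)
        y -= dy
        norm(dy) < tol && break
    end
    return y
end

x = [1.2, 0.8]
y = solve(x)

drdy = ForwardDiff.jacobian(yy -> r(yy, x), y)
drdx = ForwardDiff.jacobian(xx -> r(y, xx), x)
dJdy = ForwardDiff.gradient(yy -> J(yy, x), y)
dJdx_explicit = ForwardDiff.gradient(xx -> J(y, xx), x)

# adjoint: one linear solve with (dr/dy)' regardless of the number of inputs
lam = drdy' \ dJdy
dJdx = dJdx_explicit .- drdx' * lam         # total derivative dJ/dx
```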

**Custom Rule Motivation**:

A different but related need is to propagate AD through functions that are not AD compatible. A common example would be a call to a subfunction in another language that is part of a larger AD-compatible function. This package provides a simple wrapper to estimate the derivatives of the subfunction with finite differencing (forward or central) or complex step. Those derivatives are then inserted into the AD chain so that the overall function seamlessly works with ForwardDiff or ReverseDiff.

That same functionality is also useful in cases where a function is already AD compatible but a more efficient rule is available. We can provide the Jacobian or the Jacobian-vector / vector-Jacobian products directly. One common example is mixed-mode AD. In this case we may have a subfunction that is most efficiently differentiated in reverse mode, but the overall function is differentiated in forward mode. We can provide a custom rule for the subfunction which will then be inserted into the forward mode chain.
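
As a companion to the finite-difference default, here is a conceptual sketch (not this package's API) of the complex-step alternative mentioned above, which avoids subtractive cancellation; `blackbox` is a hypothetical stand-in for a subfunction that happens to accept complex input.

```julia
# stand-in subfunction (hypothetical); complex step requires it to accept complex input
blackbox(x) = [x[1]^2 + sin(x[2]), exp(x[1]) * x[2]]

# complex-step directional derivative: J(x)*v ≈ Im(f(x + im*h*v)) / h
function cs_jvp(f, x, v; h=1e-30)
    return imag.(f(x .+ im .* h .* v)) ./ h
end

x = [0.3, 1.1]
v = [0.0, 1.0]
println(cs_jvp(blackbox, x, v))   # ≈ second column of the Jacobian, near machine precision
```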

**Documentation**:

- Start with the [tutorial](tutorial.md) to learn usage.
- The API is described in the [reference](reference.md) page.
- The math is particularly helpful for those wanting to provide their own custom subfunctions. See the [theory](theory.md) page.
- The math is particularly helpful for those wanting to provide their own custom subfunctions. See the theory and also some scaling examples in this [PDF](https://arxiv.org/pdf/2306.15243.pdf). A supplementary document deriving the linear and eigenvalue cases is available in this [PDF]().

**Run Unit Tests**:

@@ -46,6 +34,10 @@ pkg> activate .
pkg> test
```

**Citing**:

For now, please cite the following preprint. DOI: [10.48550/arXiv.2306.15243](https://doi.org/10.48550/arXiv.2306.15243)

**Other Packages**:

[Nonconvex.jl](https://julianonconvex.github.io/Nonconvex.jl/stable/gradients/implicit/) and [ImplicitDifferentiation.jl](https://github.com/gdalle/ImplicitDifferentiation.jl) (a simplified version of the first package) are other prior implementations of the nonlinear case. These two support ChainRules compatible packages and iterative linear solvers, whereas we have focused on ForwardDiff and ReverseDiff (though it will also work with ChainRules packages in reverse mode) and we support both direct and iterative solvers. We've also added specialized rules for linear solvers, and ordinary differential equations in the form of a discrete adjoint (or discrete direct/forward mode). [SciML](https://docs.sciml.ai/SciMLSensitivity/stable/manual/differential_equation_sensitivities/#sensitivity_diffeq) provides support for continuous adjoints of ODEs. They have also recently added an implementation for the [nonlinear case](https://docs.sciml.ai/SciMLSensitivity/stable/manual/nonlinear_solve_sensitivities/), which looks to support a wide range of AD packages and also allows custom linear solvers.
[Nonconvex.jl](https://julianonconvex.github.io/Nonconvex.jl/stable/gradients/implicit/) and [ImplicitDifferentiation.jl](https://github.com/gdalle/ImplicitDifferentiation.jl) are other prior implementations of the nonlinear portion of this package. [SciML](https://docs.sciml.ai/SciMLSensitivity/stable/manual/differential_equation_sensitivities/#sensitivity_diffeq) provides support for continuous unsteady adjoints of ODEs. They have also recently added an implementation for the [nonlinear case](https://docs.sciml.ai/SciMLSensitivity/stable/manual/nonlinear_solve_sensitivities/).