diff --git a/docs/Project.toml b/docs/Project.toml
index fbc6a4185..63bdfbefd 100644
--- a/docs/Project.toml
+++ b/docs/Project.toml
@@ -5,6 +5,7 @@ BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 IncompleteLU = "40713840-3770-5561-ab4c-a76e7d0d7895"
 LinearSolve = "7ed4a6bd-45f5-4d41-b270-4a48e9bafcae"
+ModelingToolkit = "961ee093-0014-501f-94e3-6117800e7a78"
 NonlinearSolve = "8913a72c-1f9b-4ce2-8d82-65094dcecaec"
 NonlinearSolveMINPACK = "c100e077-885d-495a-a2ea-599e143bf69d"
 SciMLNLSolve = "e9a6253c-8580-4d32-9898-8661bb511710"
diff --git a/docs/pages.jl b/docs/pages.jl
index c29d2874b..8564261bd 100644
--- a/docs/pages.jl
+++ b/docs/pages.jl
@@ -1,9 +1,11 @@
 # Put in a separate page so it can be used by SciMLDocs.jl
 pages = ["index.md",
-    "tutorials/getting_started.md",
-    "Tutorials" => Any["tutorials/code_optimization.md",
-        "tutorials/large_systems.md",
+    "Getting Started with Nonlinear Rootfinding in Julia" => "tutorials/getting_started.md",
+    "Tutorials" => Any[
+        "Code Optimization for Small Nonlinear Systems" => "tutorials/code_optimization.md",
+        "Handling Large Ill-Conditioned and Sparse Systems" => "tutorials/large_systems.md",
+        "Symbolic System Definition and Acceleration via ModelingToolkit" => "tutorials/modelingtoolkit.md",
         "tutorials/small_compile.md",
         "tutorials/termination_conditions.md",
         "tutorials/iterator_interface.md"],
diff --git a/docs/src/tutorials/code_optimization.md b/docs/src/tutorials/code_optimization.md
index 24d003206..daefcb855 100644
--- a/docs/src/tutorials/code_optimization.md
+++ b/docs/src/tutorials/code_optimization.md
@@ -1,6 +1,6 @@
-# [Code Optimization for Solving Nonlinear Systems](@id code_optimization)
+# [Code Optimization for Small Nonlinear Systems in Julia](@id code_optimization)
 
-## Code Optimization in Julia
+## General Code Optimization in Julia
 
 Before starting this tutorial, we recommend the reader to check out one of the
 many tutorials for optimization Julia code. The following is an incomplete
@@ -29,30 +29,145 @@ Let's start with small systems.
 
 ## Optimizing Nonlinear Solver Code for Small Systems
 
-```@example
-using NonlinearSolve, StaticArrays
+Take for example a prototypical small nonlinear solver code in its out-of-place form:
 
-f(u, p) = u .* u .- p
-u0 = @SVector[1.0, 1.0]
+```@example small_opt
+using NonlinearSolve
+
+function f(u, p)
+    u .* u .- p
+end
+u0 = [1.0, 1.0]
 p = 2.0
-probN = NonlinearProblem(f, u0, p)
-sol = solve(probN, NewtonRaphson(), reltol = 1e-9)
+prob = NonlinearProblem(f, u0, p)
+sol = solve(prob, NewtonRaphson())
 ```
 
-## Using Jacobian Free Newton Krylov (JNFK) Methods
+We can use BenchmarkTools.jl to get more precise timings:
 
-If we want to solve the first example, without constructing the entire Jacobian
+```@example small_opt
+using BenchmarkTools
+
+@btime solve(prob, NewtonRaphson())
+```
 
-```@example
-using NonlinearSolve, LinearSolve
+Note that this way of writing the function is a shorthand for:
 
-function f!(res, u, p)
-    @. res = u * u - p
+```@example small_opt
+function f(u, p)
+    [u[1] * u[1] - p, u[2] * u[2] - p]
 end
-u0 = [1.0, 1.0]
+```
+
+where the function `f` returns an array. This is a common pattern from things like MATLAB's `fzero`
+or SciPy's `scipy.optimize.fsolve`. However, by design it's very slow. In the benchmark, you can see
+that there are many allocations. These allocations cause the memory allocator to take more time
+than the actual numerics itself, which is one of the reasons why codes from MATLAB and Python end up
+slow.
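+
+To make the cost concrete, we can also ask BenchmarkTools.jl how much memory a single out-of-place
+solve heap-allocates. This check is a supplementary illustration rather than part of the benchmark
+above, and the exact byte count will vary across Julia versions and machines:
+
+```@example small_opt
+# Minimum bytes heap-allocated by one out-of-place solve (value depends on the setup)
+@ballocated solve($prob, NewtonRaphson())
+```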
+
+In order to avoid this issue, we can use a non-allocating "in-place" approach. Written out by hand,
+this looks like:
+
+```@example small_opt
+function f!(du, u, p)
+    du[1] = u[1] * u[1] - p
+    du[2] = u[2] * u[2] - p
+    nothing
+end
+
+prob = NonlinearProblem(f!, u0, p)
+@btime sol = solve(prob, NewtonRaphson())
+```
+
+Notice how much faster this already runs! We can make this code even simpler by using
+the `.=` in-place broadcasting.
+
+```@example small_opt
+function f!(du, u, p)
+    du .= u .* u .- p
+end
+
+@btime sol = solve(prob, NewtonRaphson())
+```
+
+## Further Optimizations for Small Nonlinear Solves with Static Arrays and SimpleNonlinearSolve
+
+Allocations are only expensive if they are “heap allocations”. For a more
+in-depth definition of heap allocations,
+[there are many sources online](http://net-informations.com/faq/net/stack-heap.htm).
+But a good working definition is that heap allocations are variable-sized slabs
+of memory which have to be pointed to, and this pointer indirection costs time.
+Additionally, the heap has to be managed, and the garbage collector has to
+actively keep track of what's on the heap.
+
+However, there's an alternative to heap allocations, known as stack allocations.
+The stack is statically-sized (known at compile time) and thus its accesses are
+quick. Additionally, the exact block of memory is known in advance by the
+compiler, and thus re-using the memory is cheap. This means that allocating on
+the stack has essentially no cost!
+
+Arrays have to be heap allocated because their size (and thus the amount of
+memory they take up) is determined at runtime. But there are structures in
+Julia which are stack-allocated. `struct`s for example are stack-allocated
+“value-type”s. `Tuple`s are a stack-allocated collection. The most useful data
+structure for NonlinearSolve though is the `StaticArray` from the package
+[StaticArrays.jl](https://github.com/JuliaArrays/StaticArrays.jl). These arrays
+have their length determined at compile-time. They are created using macros
+attached to normal array expressions, for example:
+
+```@example small_opt
+using StaticArrays
+A = SA[2.0, 3.0, 5.0]
+typeof(A) # SVector{3, Float64} (alias for SArray{Tuple{3}, Float64, 1, 3})
+```
+
+Notice that the `3` after `SVector` gives the size of the `SVector`. It cannot
+be changed. Additionally, `SVector`s are immutable, so we have to create a new
+`SVector` to change values. But remember, we don't have to worry about
+allocations because this data structure is stack-allocated. `SArray`s have
+numerous extra optimizations as well: they have fast matrix multiplication,
+fast QR factorizations, etc. which directly make use of the information about
+the size of the array. Thus, when possible, they should be used.
+
+Unfortunately, static arrays can only be used for sufficiently small arrays.
+After a certain size, they are forced to heap allocate after some instructions
+and their compile time balloons. Thus, static arrays shouldn't be used if your
+system has more than ~20 variables. Additionally, only the native Julia
+algorithms can fully utilize static arrays.
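+
+To make the stack-versus-heap distinction concrete, here is a small comparison (an illustrative
+addition, not part of the original text) between an ordinary `Vector` and the stack-allocated
+`SVector` `A` defined above; exact timings and allocation counts depend on your machine:
+
+```@example small_opt
+B = [2.0, 3.0, 5.0]  # an ordinary heap-allocated Vector for comparison
+@btime $B .* $B  # allocates a new Vector for the result on every call
+@btime $A .* $A  # returns an SVector, with zero heap allocations
+```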
+
+Let's ***optimize our nonlinear solve using static arrays***. Note that in this case, we
+want to use the out-of-place allocating form, but this time we want to output
+a static array. Doing it with broadcasting looks like:
+
+```@example small_opt
+function f_SA(u, p)
+    u .* u .- p
+end
+u0 = SA[1.0, 1.0]
 p = 2.0
-prob = NonlinearProblem(f!, u0, p)
+prob = NonlinearProblem(f_SA, u0, p)
+@btime solve(prob, NewtonRaphson())
+```
+
+Note that the only change here is that `u0` is made into a StaticArray! If we needed to write `f_SA` out
+for a more complex nonlinear case, then we'd simply do the following:
+
+```@example small_opt
+function f_SA(u, p)
+    SA[u[1] * u[1] - p, u[2] * u[2] - p]
+end
 
-linsolve = LinearSolve.KrylovJL_GMRES()
-sol = solve(prob, NewtonRaphson(; linsolve), reltol = 1e-9)
+@btime solve(prob, NewtonRaphson())
 ```
+
+However, notice that this did not give us a speedup but rather a slowdown. This is because many of the
+methods in NonlinearSolve.jl are designed to scale to larger problems, and thus aggressively cache
+intermediate values, which pays off for large problems but is pure overhead for these smaller static-array problems.
+In order to see the full benefit, we need to move to one of the methods from SimpleNonlinearSolve.jl,
+which are designed for these small-scale static examples. Let's now use `SimpleNewtonRaphson`:
+
+```@example small_opt
+@btime solve(prob, SimpleNewtonRaphson())
+```
+
+And there we go, around 100ns from our starting point of almost 6μs!
\ No newline at end of file
diff --git a/docs/src/tutorials/large_systems.md b/docs/src/tutorials/large_systems.md
index 5cbb99167..c1936932e 100644
--- a/docs/src/tutorials/large_systems.md
+++ b/docs/src/tutorials/large_systems.md
@@ -1,6 +1,9 @@
-# [Solving Large Ill-Conditioned Nonlinear Systems with NonlinearSolve.jl](@id large_systems)
+# [Efficiently Solving Large Sparse Ill-Conditioned Nonlinear Systems in Julia](@id large_systems)
 
-This tutorial is for getting into the extra features of using NonlinearSolve.jl. Solving ill-conditioned nonlinear systems requires specializing the linear solver on properties of the Jacobian in order to cut down on the ``\mathcal{O}(n^3)`` linear solve and the ``\mathcal{O}(n^2)`` back-solves. This tutorial is designed to explain the advanced usage of NonlinearSolve.jl by solving the steady state stiff Brusselator partial differential equation (BRUSS) using NonlinearSolve.jl.
+This tutorial is for getting into the extra features of using NonlinearSolve.jl. Solving ill-conditioned nonlinear systems
+requires specializing the linear solver on properties of the Jacobian in order to cut down on the ``\mathcal{O}(n^3)``
+linear solve and the ``\mathcal{O}(n^2)`` back-solves. This tutorial is designed to explain the advanced usage of
+NonlinearSolve.jl by solving the steady state stiff Brusselator partial differential equation (BRUSS) using NonlinearSolve.jl.
 
 ## Definition of the Brusselator Equation
 
@@ -179,7 +182,9 @@ nothing # hide
 
 Notice that this acceleration does not require the definition of a sparsity
 pattern, and can thus be an easier way to scale for large problems. For more
-information on linear solver choices, see the [linear solver documentation](https://docs.sciml.ai/DiffEqDocs/stable/features/linear_nonlinear/#linear_nonlinear). `linsolve` choices are any valid [LinearSolve.jl](https://linearsolve.sciml.ai/dev/) solver.
+information on linear solver choices, see the
+[linear solver documentation](https://docs.sciml.ai/DiffEqDocs/stable/features/linear_nonlinear/#linear_nonlinear).
+`linsolve` choices are any valid [LinearSolve.jl](https://linearsolve.sciml.ai/dev/) solver.
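+
+For example, the call pattern looks like the following sketch. This snippet is illustrative rather
+than part of the tutorial's benchmark suite, and it assumes the `prob_brusselator_2d` problem
+constructed earlier in this tutorial:
+
+```julia
+using LinearSolve
+
+# Any LinearSolve.jl algorithm can be handed to the nonlinear solver via `linsolve`.
+sol = solve(prob_brusselator_2d, NewtonRaphson(linsolve = KrylovJL_GMRES()))
+```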
 
 !!! note
diff --git a/docs/src/tutorials/modelingtoolkit.md b/docs/src/tutorials/modelingtoolkit.md
new file mode 100644
index 000000000..7d26cb30c
--- /dev/null
+++ b/docs/src/tutorials/modelingtoolkit.md
@@ -0,0 +1,123 @@
+# [Symbolic Nonlinear System Definition and Acceleration via ModelingToolkit](@id modelingtoolkit)
+
+[ModelingToolkit.jl](https://docs.sciml.ai/ModelingToolkit/dev/) is a symbolic-numeric modeling system
+for the Julia SciML ecosystem. It adds a high-level interactive interface for the numerical solvers
+which can make it easy to symbolically modify and generate equations to be solved. The basic form of
+using ModelingToolkit looks as follows:
+
+```@example mtk
+using ModelingToolkit, NonlinearSolve
+
+@variables x y z
+@parameters σ ρ β
+
+# Define a nonlinear system
+eqs = [0 ~ σ * (y - x),
+    0 ~ x * (ρ - z) - y,
+    0 ~ x * y - β * z]
+@named ns = NonlinearSystem(eqs, [x, y, z], [σ, ρ, β])
+
+u0 = [x => 1.0,
+    y => 0.0,
+    z => 0.0]
+
+ps = [σ => 10.0,
+    ρ => 26.0,
+    β => 8 / 3]
+
+prob = NonlinearProblem(ns, u0, ps)
+sol = solve(prob, NewtonRaphson())
+```
+
+## Symbolic Derivations of Extra Functions
+
+As a symbolic system, ModelingToolkit can be used to represent the equations and derive new forms. For example,
+let's look at the equations:
+
+```@example mtk
+equations(ns)
+```
+
+We can ask it what the Jacobian of our system is via `calculate_jacobian`:
+
+```@example mtk
+calculate_jacobian(ns)
+```
+
+We can tell MTK to generate a computable form of this analytical Jacobian via `jac = true` to help the solver
+use efficient forms:
+
+```@example mtk
+prob = NonlinearProblem(ns, u0, ps, jac = true)
+sol = solve(prob, NewtonRaphson())
+```
+
+## Symbolic Simplification of Nonlinear Systems via Tearing
+
+One of the major reasons for using ModelingToolkit is to allow structural simplification of the systems. It's very
+easy to write down a mathematical model that, in theory, could be solved more simply. Let's take a look at a quick
+system:
+
+```@example mtk
+@variables u1 u2 u3 u4 u5
+eqs = [
+    0 ~ u1 - sin(u5),
+    0 ~ u2 - cos(u1),
+    0 ~ u3 - hypot(u1, u2),
+    0 ~ u4 - hypot(u2, u3),
+    0 ~ u5 - hypot(u4, u1),
+]
+@named sys = NonlinearSystem(eqs, [u1, u2, u3, u4, u5], [])
+```
+
+If we run structural simplification, we receive the following form:
+
+```@example mtk
+sys = structural_simplify(sys)
+```
+
+```@example mtk
+equations(sys)
+```
+
+How did it do this? Let's look at the `observed` to see the relationships that it found:
+
+```@example mtk
+observed(sys)
+```
+
+Using ModelingToolkit, we can build and solve the simplified system:
+
+```@example mtk
+u0 = [u5 .=> 1.0]
+prob = NonlinearProblem(sys, u0)
+sol = solve(prob, NewtonRaphson())
+```
+
+We can then use symbolic indexing to retrieve any variable:
+
+```@example mtk
+sol[u1]
+```
+
+```@example mtk
+sol[u2]
+```
+
+```@example mtk
+sol[u3]
+```
+
+```@example mtk
+sol[u4]
+```
+
+```@example mtk
+sol[u5]
+```
+
+## Component-Based and Acausal Modeling
+
+If you're interested in building models in a component or block based form, such as seen in systems like Simulink or Modelica,
+take a deeper look at [ModelingToolkit.jl's documentation](https://docs.sciml.ai/ModelingToolkit/stable/) which goes into
+detail on such features.
\ No newline at end of file
diff --git a/docs/src/tutorials/small_compile.md b/docs/src/tutorials/small_compile.md
index aab4531b8..d7d2c2311 100644
--- a/docs/src/tutorials/small_compile.md
+++ b/docs/src/tutorials/small_compile.md
@@ -1,3 +1,54 @@
 # Faster Startup and and Static Compilation
 
-This is a stub article to be completed soon.
+In many instances one may want a very lightweight version of NonlinearSolve.jl. For this case there
+exists the solver package SimpleNonlinearSolve.jl. SimpleNonlinearSolve.jl solvers all satisfy the
+same interface as NonlinearSolve.jl, but they are designed to be simpler and more lightweight, and thus
+have a faster startup time. Everything that can be done with NonlinearSolve.jl can be done with
+SimpleNonlinearSolve.jl. Thus, for example, we can solve the core tutorial problem with just SimpleNonlinearSolve.jl
+as follows:
+
+```@example simple
+using SimpleNonlinearSolve
+
+f(u, p) = u .* u .- p
+u0 = [1.0, 1.0]
+p = 2.0
+prob = NonlinearProblem(f, u0, p)
+sol = solve(prob, SimpleNewtonRaphson())
+```
+
+However, there are a few downsides to SimpleNonlinearSolve's `SimpleX` style algorithms to note:
+
+ 1. SimpleNonlinearSolve.jl's methods are not hooked into the LinearSolve.jl system, and thus do not have
+    the ability to specify linear solvers, use sparse matrices, preconditioners, and all of the other features
+    which are required to scale for very large systems of equations.
+ 2. SimpleNonlinearSolve.jl's methods have less robust error handling and termination conditions, and thus
+    these methods are missing some flexibility and give worse hints for debugging.
+ 3. SimpleNonlinearSolve.jl's methods are focused on out-of-place support. There is some in-place support,
+    but it's designed for simple smaller systems, and so some optimizations are missing.
+
+However, the major upsides of SimpleNonlinearSolve.jl are:
+
+ 1. The methods are optimized and non-allocating on StaticArrays.
+ 2. The methods have minimal compilation overhead.
+
+As such, you can use the code as shown above to have very low startup with good methods, but for more scaling and debuggability
+we recommend the full NonlinearSolve.jl. That said, SimpleNonlinearSolve.jl works seamlessly with StaticArrays.jl:
+
+```@example simple
+using StaticArrays
+
+u0 = SA[1.0, 1.0]
+p = 2.0
+prob = NonlinearProblem(f, u0, p)
+sol = solve(prob, SimpleNewtonRaphson())
+```
+
+Using StaticArrays.jl is also the fastest form for small equations, so if you know your system is small then SimpleNonlinearSolve.jl
+is not only sufficient but optimal.
+
+## Static Compilation
+
+Julia has tools for building small binaries via static compilation with [StaticCompiler.jl](https://github.com/tshort/StaticCompiler.jl).
+However, these tools are currently limited to type-stable non-allocating functions. That said, SimpleNonlinearSolve.jl's solvers are
+precisely the subset of NonlinearSolve.jl which are compatible with static compilation.
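+
+As a rough sketch (an illustrative addition, not taken from the package documentation) of the kind of
+type-stable, non-allocating entry point such tooling expects, the static-array workflow from above can
+be wrapped in a small function; the name `solve_static` is purely for illustration:
+
+```julia
+using SimpleNonlinearSolve, StaticArrays
+
+f(u, p) = u .* u .- p
+
+# A type-stable, out-of-place entry point built from static arrays. As noted above,
+# SimpleNonlinearSolve's methods are non-allocating on StaticArrays, which is the
+# style of function static compilation tooling expects.
+solve_static(p) = solve(NonlinearProblem(f, SA[1.0, 1.0], p), SimpleNewtonRaphson())
+
+solve_static(2.0)
+```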