diff --git a/docs/src/getting_started/fit_simulation.md b/docs/src/getting_started/fit_simulation.md
index 10ee7b58a08..7929c7115c0 100644
--- a/docs/src/getting_started/fit_simulation.md
+++ b/docs/src/getting_started/fit_simulation.md
@@ -46,18 +46,27 @@ Now, in [the first tutorial](@ref first_sim), we assumed:
 > Luckily, a local guide provided us with some parameters that seem to match the system!
 
 Sadly, magical nymphs do not always show up and give us parameters. Thus in this case,
-let's assume that we are just given data that is representative of the solution with
-``\alpha = 1.5``, ``\beta = 1.0``, ``\gamma = 3.0``, and ``\delta = 1.0``. This data
-is given over a time span of ``t_0 = 0`` to ``t_f = 10`` with data taken on both rabbits
-and wolves at every ``\Delta t = 1.`` Can we figure out what the parameter values should
-be directly from the data?
+we will need to use Optimization.jl to optimize the model parameters to best fit some
+experimental data.
+We are given experimentally observed data of both rabbit and wolf populations
+over a time span of ``t_0 = 0`` to ``t_f = 10`` at every ``\Delta t = 1``.
+Can we figure out what the parameter values should be directly from the data?
 
 ## Solution as Copy-Pastable Code
 
 ```@example
 using DifferentialEquations, Optimization, OptimizationPolyalgorithms, SciMLSensitivity
 using ForwardDiff, Plots
 
+# Define experimental data
+t_data = 0:10
+x_data = [1.000 2.773 6.773 0.971 1.886 6.101 1.398 1.335 4.353 3.247 1.034]
+y_data = [1.000 0.259 2.015 1.908 0.323 0.629 3.458 0.508 0.314 4.547 0.906]
+xy_data = vcat(x_data, y_data)
+
+# Plot the provided data
+scatter(t_data, xy_data', label=["x Data" "y Data"])
+
+# Setup the ODE function
 function lotka_volterra!(du, u, p, t)
     x, y = u
     α, β, δ, γ = p
@@ -72,39 +81,46 @@ u0 = [1.0, 1.0]
 tspan = (0.0, 10.0)
 
 # LV equation parameter. p = [α, β, δ, γ]
-p = [1.5, 1.0, 3.0, 1.0]
+pguess = [1.0, 1.2, 2.5, 1.2]
 
-# Setup the ODE problem, then solve
-prob = ODEProblem(lotka_volterra!, u0, tspan, p)
-datasol = solve(prob, saveat = 1)
-data = Array(datasol)
+# Set up the ODE problem with our guessed parameter values
+prob = ODEProblem(lotka_volterra!, u0, tspan, pguess)
 
-## Now do the optimization process
+# Solve the ODE problem with our guessed parameter values
+initial_sol = solve(prob, saveat = 1)
+
+# View the guessed model solution
+plt = plot(initial_sol, label = ["x Prediction" "y Prediction"])
+scatter!(plt, t_data, xy_data', label = ["x Data" "y Data"])
+
+# Define a loss metric function to be minimized
 function loss(newp)
     newprob = remake(prob, p = newp)
     sol = solve(newprob, saveat = 1)
-    loss = sum(abs2, sol .- data)
+    loss = sum(abs2, sol .- xy_data)
     return loss, sol
 end
 
-callback = function (p, l, sol)
+# Define a callback function to monitor optimization progress
+function callback(p, l, sol)
     display(l)
-    plt = plot(sol, ylim = (0, 6), label = "Current Prediction")
-    scatter!(plt, datasol, label = "Data")
+    plt = plot(sol, ylim = (0, 6), label = ["Current x Prediction" "Current y Prediction"])
+    scatter!(plt, t_data, xy_data', label = ["x Data" "y Data"])
    display(plt)
-    # Tell Optimization.solve to not halt the optimization. If return true, then
-    # optimization stops.
     return false
 end
 
-adtype = Optimization.AutoForwardDiff()
+# Set up the optimization problem with our loss function and initial guess
+adtype = AutoForwardDiff()
 pguess = [1.0, 1.2, 2.5, 1.2]
-optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
-optprob = Optimization.OptimizationProblem(optf, pguess)
-
-result_ode = Optimization.solve(optprob, PolyOpt(),
-    callback = callback,
-    maxiters = 200)
+optf = OptimizationFunction((x, _) -> loss(x), adtype)
+optprob = OptimizationProblem(optf, pguess)
+
+# Optimize the ODE parameters for best fit to our data
+pfinal = solve(optprob, PolyOpt(),
+    callback = callback,
+    maxiters = 200)
+α, β, γ, δ = round.(pfinal, digits=1)
 ```
 
 ## Step-by-Step Solution
@@ -132,31 +148,63 @@ using DifferentialEquations, Optimization, OptimizationPolyalgorithms, SciMLSens
 using ForwardDiff, Plots
 ```
 
-### Step 2: Generate the Training Data
+### Step 2: View the Training Data
+
+In our example,
+we are given observed values for `x` and `y` populations at eleven instances in time.
+Let's make that the training data for our Lotka-Volterra dynamical system model.
+
+```@example odefit
+# Define experimental data
+t_data = 0:10
+x_data = [1.000 2.773 6.773 0.971 1.886 6.101 1.398 1.335 4.353 3.247 1.034]
+y_data = [1.000 0.259 2.015 1.908 0.323 0.629 3.458 0.508 0.314 4.547 0.906]
+xy_data = vcat(x_data, y_data)
+
+# Plot the provided data
+scatter(t_data, xy_data', label=["x Data" "y Data"])
+```
+
+!!! note
 
-In our example, we assumed that we had data representative of the solution with
-``\alpha = 1.5``, ``\beta = 1.0``, ``\gamma = 3.0``, and ``\delta = 1.0``. Let's make that
-training data. The way we can do that is by defining the ODE with those parameters and
-simulating it. Unlike [the first tutorial](@ref first_sim), which used ModelingToolkit,
+    The `Array` `xy_data` above has been oriented with time instances as columns
+    so that it can be directly compared with an `ODESolution` object. (See
+    [Solution Handling](https://docs.sciml.ai/DiffEqDocs/stable/basics/solution/#solution)
+    for more information on accessing DifferentialEquations.jl solution data.)
+    However, plotting an `Array` with Plots.jl requires the variables to be columns
+    and the time instances to be rows.
+    Thus, whenever the experimental data is plotted,
+    the transpose `xy_data'` will be used.
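+
+As a quick sketch of that layout point (it only reuses the `xy_data` defined above; the
+`@show` lines are illustrative and not required by the tutorial), you can check the two
+orientations directly:
+
+```julia
+# Sketch: compare the two orientations of the training data
+@show size(xy_data)   # (2, 11): one row per state (x, y), one column per time point,
+                      #          the same layout as an ODE solution array
+@show size(xy_data')  # (11, 2): one row per time point, the layout Plots.jl expects
+```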
+
+### Step 3: Set Up the ODE Model
+
+We know that our system will behave according to the Lotka-Volterra ODE model,
+so let's set up that model with an initial guess at the parameter values:
+`\alpha`, `\beta`, `\gamma`, and `\delta`.
+Unlike [the first tutorial](@ref first_sim), which used ModelingToolkit,
 let's demonstrate using [DifferentialEquations.jl](https://docs.sciml.ai/DiffEqDocs/stable/)
 to directly define the ODE for the numerical solvers.
 
 To do this, we define a vector-based mutating function that calculates the derivatives for
 our system. We will define our system as a vector `u = [x,y]`, and thus `u[1] = x` and
-`u[2] = y`. This means that we need to calculate the derivative as `du = [dx,dy]`. Our parameters
-will simply be the vector `p = [α, β, δ, γ]`. Writing down the Lotka-Volterra equations in the
+`u[2] = y`. This means that we need to calculate the derivative as `du = [dx,dy]`.
+Our parameters will simply be the vector `p = [α, β, δ, γ]`.
+Writing down the Lotka-Volterra equations in the
 DifferentialEquations.jl direct form thus looks like the following:
 
 ```@example odefit
 function lotka_volterra!(du, u, p, t)
     x, y = u
     α, β, δ, γ = p
-    du[1] = α * x - β * x * y
-    du[2] = -δ * y + γ * x * y
+    du[1] = dx = α * x - β * x * y
+    du[2] = dy = -δ * y + γ * x * y
 end
 ```
 
-Now we need to define the initial condition, time span, and parameter vector to simulate with.
+Now we need to define the initial condition, time span, and parameter vector
+in order to solve this differential equation.
+We do not currently know the parameter values,
+but we will guess some values to start with and optimize them later.
 Following the problem setup, this looks like:
 
 ```@example odefit
@@ -167,101 +215,186 @@ u0 = [1.0, 1.0]
 tspan = (0.0, 10.0)
 
 # LV equation parameter. p = [α, β, δ, γ]
-p = [1.5, 1.0, 3.0, 1.0]
+pguess = [1.0, 1.2, 2.5, 1.2]
 ```
 
-Now we bring these pieces all together to define the `ODEProblem` and solve it. Note that we solve
-this equation with the keyword argument `saveat = 1` so that it saves a point at every ``\Delta t = 1``.
+Now we bring these pieces all together to define the `ODEProblem` and solve it.
+Note that we solve this equation with the keyword argument `saveat = 1`
+so that it saves a point at every ``\Delta t = 1`` to match our experimental data.
 
 ```@example odefit
-# Setup the ODE problem, then solve
-prob = ODEProblem(lotka_volterra!, u0, tspan, p)
-datasol = solve(prob, saveat = 1)
-```
+# Set up the ODE problem with our guessed parameter values
+prob = ODEProblem(lotka_volterra!, u0, tspan, pguess)
 
-```@example odefit
-data = Array(datasol)
+# Solve the ODE problem with our guessed parameter values
+initial_sol = solve(prob, saveat = 1)
+
+# View the guessed model solution
+plt = plot(initial_sol, label = ["x Prediction" "y Prediction"])
+scatter!(plt, t_data, xy_data', label = ["x Data" "y Data"])
 ```
 
-!!! note
+Clearly the parameter values that we guessed are not correct to model this system.
+However, we can use Optimization.jl together with DifferentialEquations.jl
+to fit our parameters to our training data.
+
+!!! note
 
     For more details on using DifferentialEquations.jl, check out the
     [getting started with DifferentialEquations.jl tutorial](https://docs.sciml.ai/DiffEqDocs/stable/getting_started/).
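+
+Before moving on, it can help to put a number on that mismatch. As a rough sketch
+(`initial_error` is a name introduced here for illustration, reusing `initial_sol` and
+`xy_data` from above), the sum of squared differences between the guessed-parameter
+solution and the data is exactly the quantity that the loss function in the next step
+will minimize:
+
+```julia
+# Sketch: sum of squared errors of the guessed-parameter solution vs. the data
+initial_error = sum(abs2, Array(initial_sol) .- xy_data)
+```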
 
-### Step 3: Set Up the Cost Function for Optimization
+### Step 4: Set Up the Loss Function for Optimization
+
+Now let's start the optimization process.
+First, let's define a loss function to be minimized.
+(It is also sometimes referred to as a cost function.)
+For our loss function, we want to take a set of parameters,
+create a new ODE which has everything the same except for the changed parameters,
+solve this ODE with new parameters, and compare the ODE solution against the provided data.
+In this case, the *loss* returned from the loss function is a quantification
+of the difference between the current solution and the desired solution.
+When this difference is minimized, our model prediction will closely approximate the observed system data.
 
-Now let's start the estimation process. First, let's define a loss function. For our loss function, we want to
-take a set of parameters, create a new ODE which has everything the same except for the changed parameters,
-solve this ODE with new parameters, and compare its predictions against the data. For this parameter changing,
+To change our parameter values,
 there is a useful functionality in the
 [SciML problems interface](https://docs.sciml.ai/DiffEqDocs/stable/basics/problem/#Modification-of-problem-types)
 called `remake` which creates a new version of an existing `SciMLProblem` with the aspect
 you want changed. For example, if we wanted to change the initial condition `u0` of our ODE,
 we could do `remake(prob, u0 = newu0)`
-For our case, we want to change around just the parameters, so we can do `remake(prob, p = newp)`
+For our case, we want to change around just the parameters, so we can do `remake(prob, p = newp)`.
+It is faster to `remake` an existing `SciMLProblem` than to create a new problem every iteration.
 
 !!! note
-    
+
     `remake` can change multiple items at once by passing more keyword arguments, i.e.,
     `remake(prob, u0 = newu0, p = newp)`. This can be used to extend the example to
     simultaneously learn the initial conditions and parameters!
 
-Now use `remake` to build the cost function. After we solve the new problem, we will calculate the sum of squared errors
-as our metric. The sum of squares can be quickly written in Julia via `sum(abs2,x)`. Using this information, our loss
-looks like:
+Now use `remake` to build the loss function. After we solve the new problem,
+we will calculate the sum of squared errors as our loss metric.
+The sum of squares can be quickly written in Julia via `sum(abs2,x)`.
+Using this information, our loss function looks like:
 
 ```@example odefit
 function loss(newp)
     newprob = remake(prob, p = newp)
     sol = solve(newprob, saveat = 1)
-    loss = sum(abs2, sol .- data)
-    return loss, sol
+    l = sum(abs2, sol .- xy_data)
+    return l, sol
 end
 ```
 
-Notice that our loss function returns the loss value as the first return, but returns extra information (the solution at the
-new parameters) as extra return arguments. We will explain why this extra return information is helpful in the next section.
+Notice that our loss function returns the loss value as the first return,
+but returns extra information (the ODE solution with the new parameters)
+as an extra return argument.
+We will explain why this extra return information is helpful in the next section.
+
+### Step 5: Solve the Optimization Problem
-### Step 4: Solve the Optimization Problem
 
+This step will look very similar to [the first optimization tutorial](@ref first_opt),
+except now we have a new loss function `loss` which returns both the loss value
+and the associated ODE solution.
+(In the previous tutorial, `L` only returned the loss value.)
+The `Optimization.solve` function can accept an optional callback function
+to monitor the optimization process using extra arguments returned from `loss`.
-This step will look very similar to [the first optimization tutorial](@ref first_opt), except now we have a new
-cost function. Here we'll also define a callback to monitor the solution process, more details about callbacks in Optimization.jl can be found [here](https://docs.sciml.ai/Optimization/stable/API/solve/).
-However, this time, our function returns two things. The callback syntax is always `(value being optimized, arguments of loss return)`
-and thus this time the callback is given `(p, l, sol)`. See, returning the solution along with the loss as part of the
-loss function is useful because we have access to it in the callback to do things like plot the current solution
-against the data! Let's do that in the following way:
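+
+For instance (a small sketch reusing the `pguess` defined in Step 3), calling the loss
+function directly shows the two values that the callback will receive on every iteration:
+
+```julia
+# Sketch: the loss function returns (loss value, ODE solution at these parameters)
+l, sol = loss(pguess)
+```
+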
+The callback syntax is always:
+
+```
+callback(
+    optimization variables,
+    the current loss value,
+    other arguments returned from the loss function, ...
+)
+```
+
+In this case, we will provide the callback the arguments `(p, l, sol)`,
+since it always takes the current state of the optimization first (`p`)
+then the returns from the loss function (`l, sol`).
+The return value of the callback function should default to `false`.
+`Optimization.solve` will halt if/when the callback function returns `true` instead.
+Typically the `return` statement would monitor the loss value
+and stop once some criterion is reached, e.g. `return loss < 0.0001`,
+but we will stop after a set number of iterations instead.
+More details about callbacks in Optimization.jl can be found
+[here](https://docs.sciml.ai/Optimization/stable/API/solve/).
 
 ```@example odefit
-callback = function (p, l, sol)
+function callback(p, l, sol)
     display(l)
-    plt = plot(sol, ylim = (0, 6), label = "Current Prediction")
-    scatter!(plt, datasol, label = "Data")
+    plt = plot(sol, ylim = (0, 6), label = ["Current x Prediction" "Current y Prediction"])
+    scatter!(plt, t_data, xy_data', label = ["x Data" "y Data"])
     display(plt)
-    # Tell Optimization.solve to not halt the optimization. If return true, then
-    # optimization stops.
     return false
 end
 ```
 
-Thus every step of the optimization will show us the loss and a plot of how the solution
-looks vs the data at our current parameters.
+With this callback function, every step of the optimization will display both the loss value and a plot of how the solution compares to the training data.
 
-Now, just like [the first optimization tutorial](@ref first_opt), we setup our optimization
-problem. To do this, we need to come up with a `pguess`, an initial condition for the parameters
-which is our best guess of the true parameters. For this, we will use `pguess = [1.0, 1.2, 2.5, 1.2]`.
+Now, just like [the first optimization tutorial](@ref first_opt),
+we set up our `OptimizationFunction` and `OptimizationProblem`,
+and then `solve` the `OptimizationProblem`.
+We will initialize the `OptimizationProblem` with the same `pguess` we used
+when setting up the ODE Model in Step 3.
+Observe how `Optimization.solve` brings the model closer to the experimental data as it iterates towards better ODE parameter values!
+
+Note that we are using the `PolyOpt()` solver choice here, which is discussed in the
+[PolyOpt documentation](https://docs.sciml.ai/Optimization/dev/optimization_packages/polyopt/).
+Parameter estimation of nonlinear differential equations is generally a non-convex problem,
+so we first run a stochastic algorithm (Adam) to get close to the minimum and then finish
+off with a quasi-Newton method (L-BFGS) to find the optimum. Together, this looks like:
 
 ```@example odefit
-adtype = Optimization.AutoForwardDiff()
-optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
+# Set up the optimization problem with our loss function and initial guess
+adtype = AutoForwardDiff()
 pguess = [1.0, 1.2, 2.5, 1.2]
-optprob = Optimization.OptimizationProblem(optf, pguess)
+optf = OptimizationFunction((x, _) -> loss(x), adtype)
+optprob = OptimizationProblem(optf, pguess)
+
+# Optimize the ODE parameters for best fit to our data
+pfinal = solve(optprob,
+    PolyOpt(),
+    callback = callback,
+    maxiters = 200)
+α, β, γ, δ = round.(pfinal, digits=1)
 ```
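+
+As a quick sanity check (a sketch; `fitted_prob` and `fitted_sol` are names introduced
+here for illustration, and `pfinal.u` holds the optimized parameter vector), you can
+re-simulate the model with the fitted parameters and overlay it on the training data:
+
+```julia
+# Sketch: re-simulate with the fitted parameters and compare against the data
+fitted_prob = remake(prob, p = pfinal.u)
+fitted_sol = solve(fitted_prob, saveat = 1)
+plt = plot(fitted_sol, label = ["x Fit" "y Fit"])
+scatter!(plt, t_data, xy_data', label = ["x Data" "y Data"])
+```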
 
-Now we solve the optimization problem:
-
-```@example odefit
-result_ode = Optimization.solve(optprob, PolyOpt(),
-    callback = callback,
-    maxiters = 200)
-```
+!!! note
+
+    When referencing the documentation for DifferentialEquations.jl and Optimization.jl
+    simultaneously, note that the variables `f`, `u`, and `p` will refer to different quantities.
+
+    DifferentialEquations.jl:
+
+    ```math
+    \frac{du}{dt} = f(u,p,t)
+    ```
+
+      - `f` in `ODEProblem` is the function defining the derivative `du` in the ODE.
+
+        Here: `lotka_volterra!`
+
+      - `u` in `ODEProblem` contains the state variables of `f`.
+
+        Here: `x` and `y`
+
+      - `p` in `ODEProblem` contains the parameter variables of `f`.
+
+        Here: `\alpha`, `\beta`, `\gamma`, and `\delta`
+
+      - `t` is the independent (time) variable.
+
+        Here: indirectly defined with `tspan` in `ODEProblem` and `saveat` in `solve`
+
+    Optimization.jl:
+
+    ```math
+    \min_{u} f(u,p)
+    ```
+
+      - `f` in `OptimizationProblem` is the function to minimize (optimize).
+
+        Here: the [anonymous function](https://docs.julialang.org/en/v1/manual/functions/#man-anonymous-functions) `(x, _) -> loss(x)`
+
+      - `u` in `OptimizationProblem` contains the state variables of `f` to be optimized.
+
+        Here: the ODE parameters `\alpha`, `\beta`, `\gamma`, and `\delta` stored in `p`
+
+      - `p` in `OptimizationProblem` contains any fixed
+        [hyperparameters](https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)) of `f`.
-and the answer from the optimization is our desired parameters.
+
+        Here: our `loss` function does not require any hyperparameters, so we pass `_` for this `p`.
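+
+    As a minimal sketch tying the two together (it only reuses objects defined earlier in
+    this tutorial and restates the mapping above):
+
+    ```julia
+    ODEProblem(lotka_volterra!, u0, tspan, pguess)  # f = lotka_volterra!, u = states [x, y], p = ODE parameters
+    OptimizationProblem(optf, pguess)               # f = wrapped loss, u = ODE parameters to optimize
+    ```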