Line Search for Newton-Raphson #194

Closed · wants to merge 26 commits
Changes from 25 commits
1 change: 1 addition & 0 deletions Project.toml
@@ -9,6 +9,7 @@ DiffEqBase = "2b5f629d-d688-5b77-993f-72d75c75574e"
EnumX = "4e289a0a-7415-4d19-859d-a7e5c4648b56"
FiniteDiff = "6a86dc24-6348-571c-b903-95158fe2bd41"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
LineSearches = "d3d80556-e9d4-5f37-9878-2ab0fcc64255"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LinearSolve = "7ed4a6bd-45f5-4d41-b270-4a48e9bafcae"
PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
4 changes: 3 additions & 1 deletion src/NonlinearSolve.jl
@@ -15,6 +15,7 @@ import ArrayInterface
import LinearSolve
using DiffEqBase
using SparseDiffTools
using LineSearches

@reexport using SciMLBase
using SciMLBase: NLStats
@@ -65,7 +66,8 @@ PrecompileTools.@compile_workload begin
end

export RadiusUpdateSchemes

export LineSearches
export NewtonRaphson, TrustRegion, LevenbergMarquardt


end # module
16 changes: 16 additions & 0 deletions src/jacobian.jl
@@ -183,3 +183,19 @@
jac_prototype = jac_prototype, chunksize = chunk_size),
num_of_chunks)
end

function simple_jacobian(cache, x::Number)
Contributor
Shouldn't the nonlinear solver be responsible for providing the Jacobian? What if the problem doesn't support autodiff?

Member Author
So should I just use FiniteDiff when autodiff is passed as false? Would that be a correct approach?

Member
This is already handled by other dispatches. Why does this exist?

Member Author
Yes; however, in the current dispatch the Jacobian is calculated at cache.u, i.e. the dispatch takes cache as an argument rather than x. But when a line search is performed, we need the Jacobian at multiple x's (for multiple step lengths $\alpha_{k}$), so we can't just call jacobian(f, cache) as in the current dispatch.

For example, like here: https://github.com/JuliaNLSolvers/LineSearches.jl/blob/ded667a80f47886c77d67e8890f6adb127679ab4/src/strongwolfe.jl#L84

I can't think of any other way to approach this; any input would be helpful.

Member Author · @yash2798 · Sep 17, 2023
Although I do have a rough idea, something like this:

function simple_jacobian(cache, x)
    cache.tmp_var = cache.u
    cache.u = x
    J = jacobian(cache, f)   ## our current dispatch, evaluated at cache.u
    cache.u = cache.tmp_var  ## restore the iterate
    return J
end

Member Author

This way we don't have to deal with sparsity and AD/finite differencing separately.

Member

Yes, it should be something like that, reusing the existing machinery. But the vjp shouldn't build the Jacobian at all: you don't actually need to calculate the Jacobian for most of this.

Member Author

Yes, you are absolutely right. So we need $J^{T}v = (v^{T}J)^{T}$, and we can calculate the vjp $v^{T}J$. Are there any routines for this in SparseDiffTools.jl, or do we need to do this kind of thing 'in situ'?

Member Author

If in situ, can you point me to something I can use as a reference for reverse-mode AD? I am not very well versed in it.

Member

Yes, that's precisely what reverse-mode AD calculates without calculating the Jacobian. This is why I want @avik-pal to take a look and see how that would mesh with #206.

@unpack f, p = cache
g = Base.Fix2(f, p)
ForwardDiff.derivative(g, x)

end
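Regarding the vjp discussion above: a minimal sketch (not part of this PR's diff) of computing the merit-function gradient $J^{T}f(x)$ via reverse-mode AD without materializing the Jacobian, assuming an out-of-place residual and using Zygote.pullback:

    using Zygote

    # ∇ₓ ½‖f(x)‖² = Jᵀ f(x), obtained as a vector-Jacobian product;
    # no Jacobian matrix is ever built.
    function grad_objective(f, x, p)
        fx, back = Zygote.pullback(u -> f(u, p), x)
        return only(back(fx))  # back(v) returns the 1-tuple (Jᵀv,)
    end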

function simple_jacobian(cache, x::AbstractArray{<:Number})
@unpack f, fu, p, prob = cache
if !SciMLBase.isinplace(prob)
g = Base.Fix2(f, p)
return ForwardDiff.jacobian(g, x)

else
return ForwardDiff.jacobian((fu, x) -> f(fu, x, p), fu, x)

end
end
92 changes: 77 additions & 15 deletions src/raphson.jl
@@ -34,6 +34,7 @@
- `diff_type`: the type of finite differencing used if `autodiff = false`. Defaults to
`Val{:forward}` for forward finite differences. For more details on the choices, see the
[FiniteDiff.jl](https://github.com/JuliaDiff/FiniteDiff.jl) documentation.
- `linesearch`: the line search algorithm to use. Defaults to `LineSearches.Static()`, which amounts to no line search.
- `linsolve`: the [LinearSolve.jl](https://github.com/SciML/LinearSolve.jl) used for the
linear solves within the Newton method. Defaults to `nothing`, which means it uses the
LinearSolve.jl default algorithm choice. For more information on available algorithm
@@ -48,29 +49,33 @@
Currently, the linear solver and chunk size choice only applies to in-place defined
`NonlinearProblem`s. That is expected to change in the future.
"""
struct NewtonRaphson{CS, AD, FDT, L, P, ST, CJ} <:
struct NewtonRaphson{CS, AD, FDT, L, P, ST, CJ, LS} <:
AbstractNewtonAlgorithm{CS, AD, FDT, ST, CJ}
linsolve::L
precs::P
linesearch::LS
end

function NewtonRaphson(; chunk_size = Val{0}(), autodiff = Val{true}(),
standardtag = Val{true}(), concrete_jac = nothing,
diff_type = Val{:forward}, linsolve = nothing, precs = DEFAULT_PRECS)
diff_type = Val{:forward}, linesearch = LineSearches.Static(),
linsolve = nothing, precs = DEFAULT_PRECS)
NewtonRaphson{_unwrap_val(chunk_size), _unwrap_val(autodiff), diff_type,
typeof(linsolve), typeof(precs), _unwrap_val(standardtag),
_unwrap_val(concrete_jac)}(linsolve,
precs)
_unwrap_val(concrete_jac), typeof(linesearch)}(linsolve,
precs, linesearch)
end
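For context (not part of the diff): a minimal usage sketch of the new keyword, assuming this branch and its `LineSearches` re-export:

    using NonlinearSolve, LineSearches

    f(u, p) = u .* u .- p  # out-of-place residual, roots at u = ±sqrt(p)
    prob = NonlinearProblem(f, [1.0, 1.0], 2.0)

    # Static() (the default) performs no line search; BackTracking() is one
    # of the LineSearches.jl algorithms this PR threads through.
    sol = solve(prob, NewtonRaphson(linesearch = BackTracking()))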

mutable struct NewtonRaphsonCache{iip, fType, algType, uType, duType, resType, pType,
mutable struct NewtonRaphsonCache{iip, fType, algType, uType, duType, resType, pType, foType, goType,
INType, tolType,
probType, ufType, L, jType, JC}
f::fType
alg::algType
u::uType
fu::resType
p::pType
α::Real
f_o::foType # stores value of objective
g_o::goType # stores grad of objective at x
uf::ufType
linsolve::L
J::jType
@@ -85,27 +90,27 @@
stats::NLStats

function NewtonRaphsonCache{iip}(f::fType, alg::algType, u::uType, fu::resType,
p::pType, uf::ufType, linsolve::L, J::jType,
p::pType, α::Real, f_o::foType, g_o::goType, uf::ufType, linsolve::L, J::jType,
du1::duType,
jac_config::JC, force_stop::Bool, maxiters::Int,
internalnorm::INType,
retcode::SciMLBase.ReturnCode.T, abstol::tolType,
prob::probType,
stats::NLStats) where {
iip, fType, algType, uType,
duType, resType, pType, INType,
duType, resType, pType, foType, goType, INType,
tolType,
probType, ufType, L, jType, JC}
new{iip, fType, algType, uType, duType, resType, pType, INType, tolType,
probType, ufType, L, jType, JC}(f, alg, u, fu, p,
new{iip, fType, algType, uType, duType, resType, pType, foType, goType, INType, tolType,
probType, ufType, L, jType, JC}(f, alg, u, fu, p, α, f_o, g_o,
uf, linsolve, J, du1, jac_config,
force_stop, maxiters, internalnorm,
retcode, abstol, prob, stats)
end
end

function jacobian_caches(alg::NewtonRaphson, f, u, p, ::Val{false})
JacobianWrapper(f, p), nothing, nothing, nothing, nothing
JacobianWrapper(f, p), nothing, nothing, zero(u), nothing
end

function SciMLBase.__init(prob::NonlinearProblem{uType, iip}, alg::NewtonRaphson,
@@ -122,6 +127,9 @@
end
f = prob.f
p = prob.p
α = 1.0 # line search coefficient
f_o = 0.0
g_o = zero(u)
if iip
fu = zero(u)
f(fu, u, p)
@@ -130,7 +138,7 @@
end
uf, linsolve, J, du1, jac_config = jacobian_caches(alg, f, u, p, Val(iip))

return NewtonRaphsonCache{iip}(f, alg, u, fu, p, uf, linsolve, J, du1, jac_config,
return NewtonRaphsonCache{iip}(f, alg, u, fu, p, α, f_o, g_o, uf, linsolve, J, du1, jac_config,
false, maxiters, internalnorm,
ReturnCode.Default, abstol, prob, NLStats(1, 0, 0, 0, 0))
end
@@ -139,13 +147,13 @@
@unpack u, fu, f, p, alg = cache
@unpack J, linsolve, du1 = cache
jacobian!(J, cache)

# u = u - J \ fu
linres = dolinsolve(alg.precs, linsolve, A = J, b = _vec(fu), linu = _vec(du1),
p = p, reltol = cache.abstol)
cache.linsolve = linres.cache
@. u = u - du1
f(fu, u, p)
perform_linesearch!(cache)
@. cache.u = cache.u - cache.α * cache.du1
f(cache.fu, cache.u, p)

if cache.internalnorm(cache.fu) < cache.abstol
cache.force_stop = true
@@ -160,7 +168,9 @@
function perform_step!(cache::NewtonRaphsonCache{false})
@unpack u, fu, f, p = cache
J = jacobian(cache, f)
cache.u = u - J \ fu
cache.du1 = J \ fu
perform_linesearch!(cache)
cache.u = cache.u - cache.α * cache.du1
cache.fu = f(cache.u, p)
if iszero(cache.fu) || cache.internalnorm(cache.fu) < cache.abstol
cache.force_stop = true
@@ -172,6 +182,43 @@
return nothing
end

function objective_linesearch(cache::NewtonRaphsonCache) ## returns the objective functions required by the line search
Member Author
This function `objective_linesearch` returns 3 closures and is called in a loop (so it may be called many times). Is this an efficient approach?

@unpack f, p = cache

function fo(x)
return dot(value_f(cache, x), value_f(cache, x)) / 2
Member
Why is the same function called twice?

end

function g!(g_o, x)
mul!(g_o, simple_jacobian(cache, x)', value_f(cache, x))
Member
This is not a good way to compute this.

Member
This should be a vjp with reverse-mode AD?

g_o

end

function fg!(g_o, x)
g!(g_o, x)
fo(x)

end
return fo, g!, fg!
end
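As the review above notes, `fo` evaluates `value_f` twice per call. A hedged alternative sketch that evaluates the residual once (same helper, same semantics):

    function fo(x)
        fx = value_f(cache, x)  # evaluate the residual once
        return dot(fx, fx) / 2  # ½‖f(x)‖²
    end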

function perform_linesearch!(cache::NewtonRaphsonCache)
@unpack u, fu, du1, alg, α, g_o = cache
fo, g!, fg! = objective_linesearch(cache)
ϕ(α) = fo(u .- α .* du1)

function dϕ(α)
g!(g_o, u .- α .* du1)
return dot(g_o, -du1)

end

function ϕdϕ(α)
return (fg!(g_o, u .- α .* du1), dot(g_o, -du1))

end
Member Author (on lines +210 to +219)
Similarly, these 3 closures are defined each time `perform_linesearch!` is called. Can this lead to performance issues?

cache.f_o = fo(u)
g!(g_o, u)  # gradient at the current iterate, so dϕ(0) below is fresh
cache.α, cache.f_o = alg.linesearch(ϕ, dϕ, ϕdϕ, 1.0, cache.f_o, dot(g_o, -du1))
end
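For readers unfamiliar with the call on the last line: LineSearches.jl algorithms are functors invoked as `ls(ϕ, dϕ, ϕdϕ, α₀, ϕ(0), dϕ(0))` and return `(α, ϕ(α))`. A standalone toy sketch of that interface (not part of this PR):

    using LineSearches

    ϕ(α) = (α - 2)^2        # toy 1-D objective along a search direction
    dϕ(α) = 2 * (α - 2)
    ϕdϕ(α) = (ϕ(α), dϕ(α))

    ls = BackTracking()
    α, ϕα = ls(ϕ, dϕ, ϕdϕ, 1.0, ϕ(0.0), dϕ(0.0))  # requires dϕ(0) < 0 (descent)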

function SciMLBase.solve!(cache::NewtonRaphsonCache)
while !cache.force_stop && cache.stats.nsteps < cache.maxiters
perform_step!(cache)
@@ -207,3 +254,18 @@
cache.retcode = ReturnCode.Default
return cache
end

## some helper functions ##

function value_f(cache::NewtonRaphsonCache, x)
iip = SciMLBase.isinplace(cache.prob)
@unpack f, p = cache
if iip
res = zero(x)
f(res, x, p)
else
res = f(x, p)
end
res
end
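A small sketch of the two residual conventions `value_f` dispatches between (illustrative names, not from the diff):

    # Out-of-place: the residual is returned.
    f_oop(x, p) = x .^ 2 .- p

    # In-place: the residual is written into a preallocated buffer.
    function f_iip(res, x, p)
        res .= x .^ 2 .- p
        return nothing
    end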

3 changes: 3 additions & 0 deletions src/utils.jl
@@ -157,3 +157,6 @@ function rfunc(r::R, c2::R, M::R, γ1::R, γ2::R, β::R) where {R <: Real} # R-f
return (1 - γ1 - β) * (exp(r - c2) + β / (1 - γ1 - β))
end
end


