Line Search for Newton-Raphson #194
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##           master     #194      +/-   ##
==========================================
- Coverage   94.42%   92.49%   -1.93%
==========================================
  Files           7        7
  Lines         699      746      +47
==========================================
+ Hits          660      690      +30
- Misses         39       56      +17
```
added objective func value output
simple jacobian calculation
1. step variable du1 for out of place functions. 2. added linesearch func
1. modified cache 2. final tweaks
```diff
@@ -149,3 +149,19 @@ function jacobian_autodiff(f, x::AbstractArray, nonlinfun, alg)
     jac_prototype = jac_prototype, chunksize = chunk_size),
     num_of_chunks)
 end
+
+function simple_jacobian(cache, x::Number)
```
shouldn't the nonlinear solve be responsible for providing the jacobian? What if the problem doesn't support autodiff?
So should I just use `FiniteDiff` when `autodiff` is passed as `false`? Would that be a correct approach?
This is already handled by other dispatches. Why does this exist?
Yes, but in the current dispatch the Jacobian is calculated at `cache.u`, i.e. the dispatch takes `cache` as an argument rather than `x`. When the line search is performed, we need the Jacobian at multiple `x`'s (one per trial step length), not just at `cache.u` as in the current `jacobian(f, cache)` dispatch.
For example, like here: https://github.com/JuliaNLSolvers/LineSearches.jl/blob/ded667a80f47886c77d67e8890f6adb127679ab4/src/strongwolfe.jl#L84. I can't think of any other way to approach this; any input would be helpful.
Although I do have a rough idea, something like this:
```julia
function simple_jacobian(cache, x)
    cache.tmp_var = cache.u      # stash the current iterate
    cache.u = x                  # evaluate the Jacobian at the trial point
    J = jacobian(cache, f)       # our current dispatch
    cache.u = cache.tmp_var      # restore the iterate
    return J
end
```
This way we don't have to deal with sparsity and AD/finite differences separately.
Yes, it should be something like that reusing the existing machinery. But the vjp shouldn't build the Jacobian at all: you don't actually need to calculate the Jacobian for most of this.
Yes, you are absolutely right. So do we need SparseDiffTools.jl, or do we need to do this kind of thing "in situ"?
If in situ, can you point me to something I can learn from regarding reverse-mode AD? I am not very well versed with it.
> So should I just use `FiniteDiff` when `autodiff` is passed as `false`? Would that be a correct approach?

Yes, an internal Jacobian function is a bit suspect. Definitely use FiniteDiff + ForwardDiff (SparseDiffTools) when possible.
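A rough sketch of that switch (purely illustrative; the package's real code routes this through its existing `jacobian_*` dispatches and SparseDiffTools, and `dense_jacobian` is a made-up name here):

```julia
using ForwardDiff, FiniteDiff

# Illustrative only: select the differentiation backend from an `autodiff` flag.
# `f` is assumed to be an out-of-place function x -> f(x) returning a vector.
function dense_jacobian(f, x::AbstractVector; autodiff::Bool = true)
    if autodiff
        return ForwardDiff.jacobian(f, x)                    # forward-mode AD
    else
        return FiniteDiff.finite_difference_jacobian(f, x)   # finite differences
    end
end
```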
@ChrisRackauckas I did some preliminary tests and they seem to perform well (similar results to NLsolve), but I have some concerns about the efficiency (code performance) in general, which I am highlighting below.
```diff
@@ -172,6 +182,43 @@ function perform_step!(cache::NewtonRaphsonCache{false})
     return nothing
 end
+
+function objective_linesearch(cache::NewtonRaphsonCache) ## returns the objective functions required in LS
```
This function `objective_linesearch` returns 3 functions and is called in a loop (so it might get called multiple times). Is this an efficient approach?
```julia
ϕ(α) = fo(u .- α .* du1)

function dϕ(α)
    g!(g_o, u .- α .* du1)
    return dot(g_o, -du1)
end

function ϕdϕ(α)
    return (fg!(g_o, u .- α .* du1), dot(g_o, -du1))
end
```
Similarly, these 3 functions are defined each time the `perform_linesearch!` function is called. Can this lead to performance issues?
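For context on how such closures end up being called many times per Newton step, here is a minimal self-contained backtracking (Armijo) sketch. This is not the LineSearches.jl API and not this PR's code; it only assumes `ϕ` and `dϕ` behave like the closures above:

```julia
# Minimal Armijo backtracking consuming ϕ and dϕ.
#   ϕ(α)  — merit function along the Newton direction
#   dϕ(α) — its directional derivative
function backtrack(ϕ, dϕ; α = 1.0, ρ = 0.5, c = 1e-4, maxiter = 20)
    ϕ0, dϕ0 = ϕ(0.0), dϕ(0.0)
    for _ in 1:maxiter
        ϕ(α) ≤ ϕ0 + c * α * dϕ0 && return α   # sufficient-decrease condition met
        α *= ρ                                 # otherwise shrink the step
    end
    return α
end
```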
```julia
end

function g!(g_o, x)
    mul!(g_o, simple_jacobian(cache, x)', value_f(cache, x))
```
This is not a good way to compute this.
Shouldn't this be a VJP with reverse-mode AD?
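A minimal sketch of that suggestion, assuming an out-of-place `f` and using Zygote directly (the real implementation would presumably go through the package's AD machinery): `J(x)' * f(x)` computed as a pullback, never materializing `J`.

```julia
using Zygote

# Gradient of the merit function 0.5 * ||f(x)||^2 is J(x)' * f(x).
# Computed as a vector-Jacobian product via reverse-mode AD, without building J.
function merit_gradient(f, x)
    fx, back = Zygote.pullback(f, x)   # fx = f(x); back(v) returns (J(x)' * v,)
    return first(back(fx))             # J(x)' * f(x)
end
```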
```julia
@unpack f, p = cache

function fo(x)
    return dot(value_f(cache, x), value_f(cache, x)) / 2
```
Why is the same function called twice?
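Presumably `fo` is meant to be the merit function ½‖f(x)‖²; a small sketch of a single-evaluation version, reusing the PR's `cache` and `value_f`:

```julia
# Evaluate the residual once and reuse it instead of calling value_f twice.
function fo(x)
    fx = value_f(cache, x)
    return dot(fx, fx) / 2
end
```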
Can be closed in favor of #203.
I incorporated the line search in that PR. It doesn't construct Jacobians, using VJPs instead, and it also caches the function construction during the nonlinear solver cache construction.
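Roughly, caching the function construction could look like the sketch below. This is only an illustration of the idea, not the actual #203 implementation; `ToyCache`, `make_linesearch_closures`, and `merit_gradient` (from the earlier VJP sketch) are made-up names.

```julia
using LinearAlgebra

# Hypothetical cache: the closures are built once, closing over the mutable cache,
# so they see the current u and du1 on every iteration without being rebuilt.
mutable struct ToyCache{F, U}
    f::F     # out-of-place residual function
    u::U     # current iterate
    du1::U   # current Newton direction
end

function make_linesearch_closures(cache::ToyCache)
    ϕ(α)  = sum(abs2, cache.f(cache.u .- α .* cache.du1)) / 2
    dϕ(α) = dot(merit_gradient(cache.f, cache.u .- α .* cache.du1), -cache.du1)
    return ϕ, dϕ
end
```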