
Line Search for Newton-Raphson #194

Closed · wants to merge 26 commits

Conversation

yash2798
Member

No description provided.

@codecov

codecov bot commented Jul 11, 2023

Codecov Report

Merging #194 (f8b6685) into master (81e9164) will decrease coverage by 1.93%.
The diff coverage is 67.92%.

❗ Current head f8b6685 differs from pull request most recent head 137d8fd. Consider uploading reports for the commit 137d8fd to get more accurate results

@@            Coverage Diff             @@
##           master     #194      +/-   ##
==========================================
- Coverage   94.42%   92.49%   -1.93%     
==========================================
  Files           7        7              
  Lines         699      746      +47     
==========================================
+ Hits          660      690      +30     
- Misses         39       56      +17     
Files Changed           Coverage            Δ
src/NonlinearSolve.jl   100.00% <ø>         (ø)
src/jacobian.jl          82.40% <0.00%>     (-8.41%) ⬇️
src/utils.jl             81.42% <ø>         (ø)
src/raphson.jl           91.50% <83.72%>    (-5.60%) ⬇️


@yash2798 yash2798 changed the title [WIP] Line Search for Newton-Raphson Line Search for Newton-Raphson Jul 20, 2023
@yash2798 yash2798 marked this pull request as ready for review July 20, 2023 00:47
@@ -149,3 +149,19 @@ function jacobian_autodiff(f, x::AbstractArray, nonlinfun, alg)
jac_prototype = jac_prototype, chunksize = chunk_size),
num_of_chunks)
end

function simple_jacobian(cache, x::Number)
Contributor

Shouldn't the nonlinear solve be responsible for providing the Jacobian? What if the problem doesn't support autodiff?

Member Author

So should I just use FiniteDiff when autodiff is passed as false? Would that be the correct approach?

Member

This is already handled by other dispatches. Why does this exist?

Member Author

Yes, but the current dispatch calculates the Jacobian at cache.u, i.e. it takes the cache as an argument rather than an arbitrary x. When a line search is performed, we need the Jacobian at multiple x's (one for each trial step length $\alpha_{k}$), so we can't just call jacobian(f, cache) as in the current dispatch.

For example, like here: https://github.com/JuliaNLSolvers/LineSearches.jl/blob/ded667a80f47886c77d67e8890f6adb127679ab4/src/strongwolfe.jl#L84

I can't think of any other way to approach this; any input would be helpful.
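For illustration, here is a minimal backtracking (Armijo) sketch, not the PR's code, showing why the merit function ϕ(α) = ½‖f(u − α·du)‖² has to be evaluated at several trial step lengths; it assumes du is the exact Newton step, so that ϕ′(0) = −2ϕ(0):

# Illustrative Armijo backtracking for a Newton step (not the PR's code).
# Assumes du solves J(u) * du = f(u), so the directional derivative of
# ϕ(α) = ½‖f(u - α·du)‖² at α = 0 equals -2ϕ(0).
function armijo_backtrack(f, u, du; α = 1.0, ρ = 0.5, c = 1e-4, maxiters = 20)
    ϕ0 = sum(abs2, f(u)) / 2
    for _ in 1:maxiters
        ϕα = sum(abs2, f(u .- α .* du)) / 2
        ϕα ≤ (1 - 2c * α) * ϕ0 && return α   # sufficient-decrease condition
        α *= ρ                                # shrink the step and retry
    end
    return α
end

Each loop iteration needs f (and, for stronger conditions such as strong Wolfe, also Jacobian information) at a fresh point u − α·du, which is why a cache-bound jacobian(f, cache) is not enough.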

Member Author
@yash2798 yash2798 Sep 17, 2023

Although I do have a rough idea, something like this:

function simple_jacobian(cache, x)
    cache.tmp_var = cache.u
    cache.u = x
    J = jacobian(cache, f) ## our current dispatch
    cache.u = cache.tmp_var
    return J
end

Member Author

This way we don't have to deal with sparsity and AD/finite differencing separately.

Member

Yes, it should be something like that, reusing the existing machinery. But the VJP shouldn't build the Jacobian at all: you don't actually need to calculate the Jacobian for most of this.

Member Author

Yes, you are absolutely right. So we need $J^{T}v = (v^{T}J)^{T}$, and we can calculate the VJP $v^{T}J$. Are there routines for this in SparseDiffTools.jl, or do we need to do it 'in situ'?

Member Author

If in situ, can you point me to something I can use as a reference for reverse-mode AD? I am not very well versed in it.

Member

Yes, that's precisely what reverse-mode AD computes without ever forming the Jacobian. This is why I want @avik-pal to take a look and see how that would mesh with #206.
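A minimal sketch of the idea, assuming Zygote.jl as the reverse-mode backend (this is not the PR's, nor #206's, implementation):

using Zygote

# The gradient of the merit function ϕ(x) = ½‖f(x)‖² is Jᵀ f(x). Reverse mode
# gives this as a single vector-Jacobian product with v = f(x), without ever
# materializing J.
function merit_gradient(f, x)
    fx, back = Zygote.pullback(f, x)   # fx = f(x); back(v) returns ((vᵀJ)ᵀ,)
    return only(back(fx))              # = Jᵀ f(x)
end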


@ChrisRackauckas
Member

So should I just use FiniteDiff when autodiff is passed as false? Would that be the correct approach?

Yes, an internal Jacobian function is a bit suspect. Definitely use FiniteDiff+ForwardDiff (SparseDiffTools) when possible.
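A minimal sketch of that fallback, assuming dense Jacobians; the name sketch_jacobian is purely illustrative and not NonlinearSolve.jl's API:

using ForwardDiff, FiniteDiff

# Illustrative only: select the Jacobian backend from the autodiff flag,
# reusing ForwardDiff when AD is allowed and FiniteDiff otherwise.
function sketch_jacobian(f, x::AbstractVector, autodiff::Bool)
    return autodiff ? ForwardDiff.jacobian(f, x) :
           FiniteDiff.finite_difference_jacobian(f, x)
end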

@yash2798 yash2798 marked this pull request as draft July 25, 2023 01:13
@yash2798 yash2798 changed the title Line Search for Newton-Raphson [WIP] Line Search for Newton-Raphson Jul 25, 2023
@yash2798
Member Author

yash2798 commented Sep 17, 2023

@ChrisRackauckas I did some preliminary tests and they seem to perform well (similar results to NLsolve), but I have some concerns regarding efficiency (or "code performance") in general, which I am highlighting below.

@@ -172,6 +182,43 @@ function perform_step!(cache::NewtonRaphsonCache{false})
return nothing
end

function objective_linesearch(cache::NewtonRaphsonCache) ## returns the objective functions required in LS
Member Author

This objective_linesearch function returns 3 functions and is called in a loop (so it might get called multiple times). Is this an efficient approach?

Comment on lines +207 to +216
ϕ(α) = fo(u .- α .* du1)

function dϕ(α)
    g!(g_o, u .- α .* du1)
    return dot(g_o, -du1)
end

function ϕdϕ(α)
    return (fg!(g_o, u .- α .* du1), dot(g_o, -du1))
end
Member Author

Similarly, these 3 functions are defined each time the perform_linesearch! function is called. Can this lead to performance issues?
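One way to avoid that, sketched below with hypothetical names, would be to build the closures once when the solver cache is constructed and store them there, rather than rebuilding them on every line-search call (this is the direction the later PR reportedly took, but the code here is only an illustration):

using LinearAlgebra

# Hypothetical sketch: construct the line-search objectives once and store
# them, instead of redefining the closures on each perform_linesearch! call.
# Assumes u, du1 and g_o are the cache's arrays, mutated in place every
# iteration, so the captured references stay valid.
struct LineSearchObjectives{P, D, PD}
    ϕ::P     # α -> ½‖f(u - α·du1)‖²
    dϕ::D    # α -> dϕ/dα
    ϕdϕ::PD  # α -> (ϕ(α), dϕ(α))
end

function build_linesearch_objectives(fo, g!, fg!, u, du1, g_o)
    ϕ(α) = fo(u .- α .* du1)
    function dϕ(α)
        g!(g_o, u .- α .* du1)
        return dot(g_o, -du1)
    end
    ϕdϕ(α) = (fg!(g_o, u .- α .* du1), dot(g_o, -du1))
    return LineSearchObjectives(ϕ, dϕ, ϕdϕ)
end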

@yash2798 yash2798 changed the title [WIP] Line Search for Newton-Raphson Line Search for Newton-Raphson Sep 17, 2023
@yash2798 yash2798 marked this pull request as ready for review September 17, 2023 01:22
end

function g!(g_o, x)
    mul!(g_o, simple_jacobian(cache, x)', value_f(cache, x))
Member

This is not a good way to compute this.

Member

This should be a VJP with reverse-mode AD?

@unpack f, p = cache

function fo(x)
    return dot(value_f(cache, x), value_f(cache, x)) / 2
Member

Why is the same function called twice?
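A minimal sketch of how the double evaluation could be avoided, reusing the names from the diff above (the enclosing objective_linesearch scope, where cache and value_f are available, is assumed):

# Sketch only: evaluate the residual once and reuse it, rather than calling
# value_f(cache, x) twice inside the merit function.
function fo(x)
    fx = value_f(cache, x)
    return dot(fx, fx) / 2
end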

@ChrisRackauckas
Member

@avik-pal I assume you'll need to incorporate this into #206

@avik-pal
Member

Can be closed in favor of #203.

@avik-pal
Member

I incorporated the line search into that PR. It doesn't construct Jacobians; instead it uses VJPs, and it also caches the function construction during the nonlinear solver cache construction.
