-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add R2N, R2DH #153
base: master
Are you sure you want to change the base?
Add R2N, R2DH #153
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update LM in a separate PR. It's unrelated to R2N.
src/LM_alg.jl
Outdated
@@ -244,6 +248,7 @@ function LM( | |||
jtprod_residual!(nls, xk, Fk, ∇fk) | |||
|
|||
σmax = opnorm(Jk) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't add redundant blank lines.
src/LM_alg.jl
Outdated
@@ -252,6 +257,7 @@ function LM( | |||
σk = σk * γ | |||
end | |||
νInv = (1 + θ) * (σmax^2 + σk) # ‖J'J + σₖ I‖ = ‖J‖² + σₖ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not here either.
4700798
to
b6cd089
Compare
@dpo here I add a test in line 194 to discuss the case if |
@dpo il est à noter que je me suis permis de changer R2_alg.jl afin de pouvoir récupérer le temps pour chaque iter et tracer la courbe objective vs temps |
|
acaad65
to
652701d
Compare
e43dbc0
to
1500d76
Compare
src/R2DH.jl
Outdated
|
||
* `x0::AbstractVector`: an initial guess (in the first calling form: default = `nlp.meta.x0`) | ||
* `selected::AbstractVector{<:Integer}`: (default `1:length(x0)`). | ||
* `Bk`: initial diagonal Hessian approximation (default: `(one(R) / options.ν) * I`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cette interface ne prend pas ce kwarg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix.
src/R2DH.jl
Outdated
D.d .= summation ? D.d .+ σk : D.d .* σk | ||
DNorm = norm(D.d, Inf) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove all duplicate blank lines.
src/R2DH.jl
Outdated
Hobj_hist[k] = hk | ||
Mmonotone > 0 && (FHobj_hist[mod(k-1, Mmonotone) + 1] = fk + hk) | ||
|
||
D.d .= max.(D.d, eps(R)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to do this?
src/R2_alg.jl
Outdated
@@ -21,6 +21,7 @@ mutable struct R2Solver{ | |||
Fobj_hist::Vector{R} | |||
Hobj_hist::Vector{R} | |||
Complex_hist::Vector{Int} | |||
Time_hist::Vector{R} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On veut se débarrasser de Fobj_hist
, etc. Les callbacks sont là pour ça. Il vaut mieux utiliser le callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok je vais ouvrir une issue pour ça
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don’t change R2 in this PR.
src/R2_alg.jl
Outdated
@@ -21,6 +21,7 @@ mutable struct R2Solver{ | |||
Fobj_hist::Vector{R} | |||
Hobj_hist::Vector{R} | |||
Complex_hist::Vector{Int} | |||
Time_hist::Vector{R} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don’t change R2 in this PR.
@@ -9,6 +9,7 @@ mutable struct ROSolverOptions{R} | |||
maxIter::Int # maximum amount of inner iterations | |||
maxTime::Float64 #maximum time allotted to the algorithm in s | |||
σmin::R # minimum σk allowed for LM/R2 method | |||
σk::R # initial σk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LM did not need this. R2N and LM should be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In contrast to the old code, we need an initialization
6cb845f
to
3245e55
Compare
src/R2DH.jl
Outdated
min φ(s; xₖ) + ψ(s; xₖ) | ||
|
||
where φ(s ; xₖ) = f(xₖ) + ∇f(xₖ)ᵀs + ½ sᵀ (σₖ+Dₖ) s is a quadratic approximation of f about xₖ, | ||
ψ(s; xₖ) = h(xₖ + s), ‖⋅‖ is a user-defined norm, Dₖ is a diagonal Hessian approximation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There’s no norm here.
src/R2DH.jl
Outdated
|
||
* `x0::AbstractVector`: an initial guess (in the first calling form: default = `nlp.meta.x0`) | ||
* `selected::AbstractVector{<:Integer}`: (default `1:length(x0)`). | ||
* `Bk`: initial diagonal Hessian approximation (default: `(one(R) / options.ν) * I`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix.
src/R2DH.jl
Outdated
∇f!(∇fk, xk) | ||
∇fk⁻ = copy(∇fk) | ||
spectral_test = isa(D, SpectralGradient) | ||
# D.d .= D.d .+ σk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove dead code.
src/R2DH.jl
Outdated
end | ||
mks = mk(s) | ||
|
||
if mks < -1e5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this for???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is triggered when
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t understand why we would need that. Firstly, -1e5 is completely arbitrary. And secondly, why would the iteration be unsuccessful if the model decrease is large?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I agree that -1e5 might be somewhat arbitrary. Should we set it to -Inf instead?
We should also ensure that the ShiftedProximalOperators handle this appropriately when encountering
My concern is that Julia might struggle with calculus involving -Inf and interpret it as NaN. However, this does not seem to be the case, except in scenarios like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be -Inf, and ShiftedProximalOperators should return -Inf.
Thanks @dpo for the comments. I tried to incorporate and answer all your comments. |
src/R2DH.jl
Outdated
end | ||
mks = mk(s) | ||
|
||
if mks < -1e5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be -Inf, and ShiftedProximalOperators should return -Inf.
src/R2N.jl
Outdated
|
||
min f(x) + h(x) | ||
|
||
where f: ℝⁿ → ℝ is C¹ and h: ℝⁿ → ℝ is lower semi-continuous and proper. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and prox-bounded.
This PR has been open for a while now, can we merge it ? @MohamedLaghdafHABIBOULLAH @dpo |
I would much prefer to review one PR instead of two. Could you please put your work together into a single PR? Maybe just close this one and open a new one?! |
@MohamedLaghdafHABIBOULLAH Please rebase this branch against |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MohamedLaghdafHABIBOULLAH Let’s please finish this fast.
src/R2DH.jl
Outdated
ν = 1 / ((DNorm + σk) * (1 + θ)) | ||
@. mν∇fk = -ν * ∇fk | ||
continue | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this bit for? If it is to increase σk until the prox returns a finite value, there should be a while
. Otherwise, please remove it. It doesn’t appear in any other solver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is occurs when mks = -Inf: so we update DNorm and ν for spectral case.
Because there is a distinction between iprox in spectral (which is a prox) and other regularizers (e.g., PSB, Andrei), as rank and nuclear norm regularizers do not have an iprox.
src/R2DH.jl
Outdated
if mks == -Inf | ||
σk = σk * γ | ||
Dkσk .= D.d .+ σk | ||
DNorm = norm(D.d, Inf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DNorm = norm(D.d, Inf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reason #153 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D has not changed here. I don’t see why you would recompute its norm.
@dpo, are we planning to merge this first in the end ?
Some of your comments are already taken care of in my version
|
Co-authored-by: Dominique <[email protected]>
precise when to debug Co-authored-by: Dominique <[email protected]>
correct $\Delta_{mod}$ Co-authored-by: Dominique <[email protected]>
Add some unit tests for R2DH/R2N/R2N_R2DH
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
I will add (-Inf) condition in ShiftedProximalOperators as well
78aa861
to
f817b6c
Compare
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Dominique <[email protected]>
Co-authored-by: Maxence Gollier <[email protected]>
Add a documentation for second calling form of R2DH
Thanks very much @dpo and @MaxenceGollier for the comments and commit suggestions. Incorporated all suggestions in this PR. |
@MaxenceGollier What are the errors in R2DH? |
# take first proximal gradient step s1 and see if current xk is nearly stationary | ||
# s1 minimizes φ1(s) + ‖s‖² / 2 / ν + ψ(s) ⟺ s1 ∈ prox{νψ}(-ν∇φ1(0)). | ||
|
||
subsolver_options.ν = 1 / νInv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, R2N should initialize subsolver_options.σk
if the subsolver is R2DH.
R2DH should receive an initial σk
and compute ν
based on that (currently, it does it the other way around).
|
||
subsolver_options.ϵa = k == 1 ? 1.0e-3 : min(sqrt_ξ1_νInv ^ (1.5) , sqrt_ξ1_νInv * 1e-3) | ||
verbose > 0 && @debug "setting inner stopping tolerance to" subsolver_options.optTol | ||
subsolver_args = subsolver == R2DH ? (SpectralGradient(νInv, f.meta.nvar),) : () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dpo what do you think about this line?
If I consider
mk(d) = φ(d) + ψ(d) | ||
|
||
if spectral_test | ||
prox!(s, ψ, mν∇fk, ν) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn’t correspond to R2DH; I think this should use ν = 1 / (DNorm + σk)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the paper, we say that R2DH is R2N where ν = θ / (DNorm + σk)
where θ
is close to one.
@dpo @MaxenceGollier, if there are no further comments, we can proceed with the merge. |
Add still allocating versions of R2N and R2DH.
Update LM to give the possibility to use R2DH as subsolver