JSO Compliance: R2N & R2DH #172
Conversation
For R2DH: on this branch …, on the master branch … (note: we have to use …).
For R2N (with R2): on this branch …, on the master branch ….
For R2N (with R2DH): on this branch …, on the master branch ….

As you can see, the results are still a bit different, but the number of subsolver iterations is exactly the same, and the difference between the stationarity measures is of the order ….
Sorry, I don’t understand the question.

@MaxenceGollier I would prefer if the file names remained ….

The R2DH allocation test fails on all checks. This is the case when ….

Yes, that’s fine. I think you just need to change https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl/blob/master/Project.toml#L21. Maybe in a separate PR, and then rebase this one.
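For reference, the change being pointed at is a `[compat]` bound in `Project.toml`; an illustrative (not verbatim) sketch of such an entry, assuming the bound being relaxed is the LinearOperators one mentioned later in the thread:

```toml
[compat]
# illustrative bound only; the real entry lives at Project.toml line 21
LinearOperators = "2.9"
```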
@dpo I think you can start reviewing.
```julia
end
Bk = hess_op(f, xk)
λmax = opnorm(Bk)
solver.subpb.model.B = hess_op(nlp, xk)
```
It seems that you do not update solver.subpb.model.∇f, or perhaps I'm mistaken.
I agree this is quite hard to spot.
Perhaps I should change this to make it clearer (do you have suggestions?).
The reason is that, if you look at how I initialize solver.subpb.model, I give it the same vector as ∇fk. Hence the memory of solver.subpb.model.∇f is shared with that of ∇fk, and any modification of one changes the other. This is why we don't need to update it: it is already done when we call grad!(...).
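A minimal sketch (toy struct, not the actual `solver.subpb.model`) of this aliasing behavior: because the struct stores a reference to the same array, an in-place update of the outer buffer is immediately visible through the struct field.

```julia
# Toy stand-in for the subproblem model: the gradient field is shared by reference.
mutable struct ToyModel
    ∇f::Vector{Float64}
end

∇fk = zeros(3)            # buffer owned by the outer solver
model = ToyModel(∇fk)     # the subproblem model receives the SAME array, no copy

∇fk .= [1.0, 2.0, 3.0]    # in-place update, as grad!(...) would do

@assert model.∇f === ∇fk  # identical objects: shared memory
@assert model.∇f[2] == 2.0
```

The key point is the `===` identity: any in-place mutation (`.=`, `grad!`) of either binding updates both, whereas rebinding with `∇fk = ...` would silently break the link.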
Thank you, @MaxenceGollier, for this enormous work. I’ve started reviewing the ….
Thanks @MaxenceGollier! It looks good!
I have a number of minor comments.
src/LMTR_alg.jl
```julia
jtprod_residual!(nls, xk, Fk, ∇fk)
σmax = opnorm(Jk)
νInv = (1 + θ) * σmax^2 # ‖J'J‖ = ‖J‖²
νInv = σmax^2/θ # ‖J'J‖ = ‖J‖²
```
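As a quick sanity check of the ‖J'J‖ = ‖J‖² identity noted in the diff's comment (dense random matrix; `θ` here is just an illustrative scalar, not the solver's actual parameter):

```julia
using LinearAlgebra

J = randn(5, 3)
σmax = opnorm(J)          # largest singular value of J

# Spectral-norm identity: the largest eigenvalue of JᵀJ is σmax(J)²,
# hence ‖JᵀJ‖ = ‖J‖² in the operator 2-norm.
@assert isapprox(opnorm(J' * J), σmax^2; rtol = 1e-8)

θ = 0.5                   # illustrative value
νInv = σmax^2 / θ         # the quantity the reviewed line computes
```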
Could you please just open a quick PR for LM and LMTR separately? I think it would be more organized.
Yes, I agree.
@MaxenceGollier I have an issue with the type stability of the method …. It seems that the constructor of the … is type-unstable:

```julia
julia> h = NormL0(λ)
NormL0{Float64}(0.05177434365494306)

julia> reg_nlp = RegularizedNLPModel(LBFGSModel(bpdn), h);

julia> @code_warntype R2NSolver(reg_nlp)
```

The output reveals the following:

```
Body::R2NSolver{Float64, _A, Vector{Float64}} where _A<:ShiftedProximableFunction
```

As you can see, Julia struggles to deduce the concrete type of …. Ideally, it should have inferred the more specific type:

```
Body::R2NSolver{Float64, ShiftedNormL0{Float64, Vector{Float64}, Vector{Float64}, Vector{Float64}}, Vector{Float64}}
```

This type instability suggests that the constructor does not propagate type information effectively, which could negatively impact the performance of ….
Sorry, it should have inferred the following:

```
Body::R2NSolver4{Float64, ShiftedNormL0{Float64, Vector{Float64}, Vector{Float64}, Vector{Float64}}, Vector{Float64}, R2Solver{Float64, ShiftedNormL0{Float64, Vector{Float64}, Vector{Float64}, Vector{Float64}}, Vector{Float64}}, RegularizedNLPModel{Float64, Vector{Float64}, R2NModel{Float64, Vector{Float64}, LBFGSOperator{Float64, Int64, LinearOperators.var"#67#69"{LinearOperators.var"#lbfgs_multiply#68", LinearOperators.LBFGSData{Float64, Int64}}, LinearOperators.var"#67#69"{LinearOperators.var"#lbfgs_multiply#68", LinearOperators.LBFGSData{Float64, Int64}}, LinearOperators.var"#67#69"{LinearOperators.var"#lbfgs_multiply#68", LinearOperators.LBFGSData{Float64, Int64}}}}, ShiftedNormL0{Float64, Vector{Float64}, Vector{Float64}, Vector{Float64}}, UnitRange{Int64}}}
```

which is the type of the object …. Here are some performance optimization tips: Julia Performance Tips.
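A minimal, self-contained illustration of this kind of instability (toy types, not the actual `R2NSolver`): a struct whose field is annotated with an abstract type hides the concrete type from the compiler, while a parametric struct propagates it through the constructor.

```julia
using Test  # for @inferred

abstract type AbstractReg end
struct NormL0Toy <: AbstractReg end

# Unstable: the field type is abstract, so accessing `h`
# can only be inferred as `AbstractReg`.
struct UnstableSolver
    h::AbstractReg
end

# Stable: the concrete type of `h` is captured in the struct parameter,
# so the constructor propagates type information.
struct StableSolver{H<:AbstractReg}
    h::H
end

f_unstable() = UnstableSolver(NormL0Toy()).h
f_stable() = StableSolver(NormL0Toy()).h

# `@code_warntype f_unstable()` reports `h::AbstractReg` (red),
# while `@code_warntype f_stable()` infers the concrete `NormL0Toy`.
@inferred f_stable()  # passes: return type inferred concretely
```

`Test.@inferred` is a convenient way to turn such checks into unit tests: it throws if the inferred return type is wider than the runtime type.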
Ok @MohamedLaghdafHABIBOULLAH, I will fix it. Note that this does not really matter, as we don't expect to spend a lot of time in the function ….
Should be good now with f136eaa.
Thanks, I think that ….
Actually, it did not solve it. I am reverting the changes now, as it broke R2N. Yes ….
```julia
result = zero(T)
n = length(d)
@inbounds for i = 1:n
  result += d[i]^2*dkσk[i]/2 + ∇fk[i]*d[i]
```
Spacing.
@MaxenceGollier Please use the Julia formatter locally to avoid this type of comment. A reviewer should only need to focus on the code, not the formatting.
I didn't know I could do that, sorry; I will make sure to do it in my future PRs.
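For reference, the loop reviewed above accumulates the diagonal quadratic model ½ Σᵢ (dkσk)ᵢ dᵢ² + Σᵢ ∇fᵢ dᵢ; a self-contained sketch (illustrative data, names mirroring the diff) with a vectorized cross-check:

```julia
using LinearAlgebra

# Same accumulation as the reviewed loop, wrapped in a function.
function qmodel(d, dkσk, ∇fk)
    result = zero(eltype(d))
    @inbounds for i in eachindex(d)
        result += d[i]^2 * dkσk[i] / 2 + ∇fk[i] * d[i]
    end
    return result
end

d    = [1.0, -2.0, 0.5]
dkσk = [2.0, 1.0, 4.0]   # diagonal quadratic-term entries, illustrative
∇fk  = [0.1, 0.2, -0.3]

# Vectorized equivalent of the model value: ½ dᵀ Diag(dkσk) d + ∇fᵀ d
@assert qmodel(d, dkσk, ∇fk) ≈ dot(d, dkσk .* d) / 2 + dot(∇fk, d)
```

Keeping the hand-written loop (rather than the broadcast form) avoids allocating the temporary `dkσk .* d`, which matters given the allocation tests discussed in this PR.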
Force-pushed from 705a38e to 6671aed.
@MohamedLaghdafHABIBOULLAH #156 broke this.
Thanks, @MaxenceGollier! I suggest doing the same as for R2 and adding a quick function like this one: ….
Here are the benchmark results.
@MohamedLaghdafHABIBOULLAH Please approve if you have no further comments.
Here are the benchmark results.
@MohamedLaghdafHABIBOULLAH 069b854 should let you send keyword args to the subsolver!
Here are the benchmark results.
Thank you @MaxenceGollier, please rebase your PR.
Here are the benchmark results.
Here are the benchmark results.
Here are the benchmark results.
Thank you, @MaxenceGollier, for this version. I compared Maxence’s implementation directly with the one on GitHub, and the results match quite well on this example:

```julia
random_seed = 1234
Random.seed!(random_seed)
compound = 10
model, nls_model, sol = bpdn_model(compound, bounds = false)
# parameters
f = LBFGSModel(model)
λ = norm(grad(model, zeros(model.meta.nvar)), Inf) / 10
h = NormL0(λ)
x0 = zeros(f.meta.nvar)
```
# Basic R2DH (git)
[ Info: using R2DH_Spec_NM_B with subsolver = None
[ Info: iter f(x) h(x) √(ξ/ν) ρ σ ‖x‖ ‖s‖
[ Info: 1 2.0e+01 0.0e+00 1.7e+00 2.4e+00 6.1e-06 0.0e+00 3.8e+00 ↘
[ Info: 2 8.7e+00 4.4e+00 1.7e+00 1.1e+00 2.0e-06 3.8e+00 5.7e+00 ↘
[ Info: 3 4.8e-01 5.5e+00 3.5e-01 1.0e+00 6.7e-07 9.1e+00 1.3e+00 ↘
[ Info: 4 1.1e-01 5.5e+00 8.0e-02 1.0e+00 2.2e-07 9.8e+00 3.5e-01 ↘
[ Info: 5 9.3e-02 5.5e+00 1.8e-02 1.0e+00 7.5e-08 1.0e+01 8.4e-02 ↘
[ Info: 6 9.2e-02 5.5e+00 9.6e-03 1.0e+00 2.5e-08 1.0e+01 3.3e-02 ↘
[ Info: 7 9.2e-02 5.5e+00 2.4e-03 1.0e+00 8.3e-09 1.0e+01 6.8e-03 ↘
[ Info: 8 9.2e-02 5.5e+00 2.7e-04 1.0e+00 2.8e-09 1.0e+01 7.4e-04 ↘
[ Info: 9 9.2e-02 5.5e+00 1.2e-04 1.0e+00 9.2e-10 1.0e+01 5.8e-04 ↘
[ Info: 10 9.2e-02 5.5e+00 1.3e-05 3.1e-10 1.0e+01 7.1e-05
[ Info: R2DH_B: terminating with √(ξ/ν) = 1.3279891624205704e-5
# Maxence R2DH
[ Info: using R2DH_Spec_NM_M with subsolver = None
[ Info: iter f(x) h(x) √(ξ/ν) ρ σ ‖x‖ ‖s‖ R2DH_M
[ Info: 0 2.0e+01 0.0e+00 1.7e+00 2.4e+00 6.1e-06 0.0e+00 3.8e+00 ↘
[ Info: 1 8.7e+00 4.4e+00 1.7e+00 1.1e+00 2.0e-06 3.8e+00 5.7e+00 ↘
[ Info: 2 4.8e-01 5.5e+00 3.5e-01 1.0e+00 6.7e-07 9.1e+00 1.3e+00 ↘
[ Info: 3 1.1e-01 5.5e+00 8.0e-02 1.0e+00 2.2e-07 9.8e+00 3.5e-01 ↘
[ Info: 4 9.3e-02 5.5e+00 1.8e-02 1.0e+00 7.5e-08 1.0e+01 8.4e-02 ↘
[ Info: 5 9.2e-02 5.5e+00 9.6e-03 1.0e+00 2.5e-08 1.0e+01 3.3e-02 ↘
[ Info: 6 9.2e-02 5.5e+00 2.4e-03 1.0e+00 8.3e-09 1.0e+01 6.8e-03 ↘
[ Info: 7 9.2e-02 5.5e+00 2.7e-04 1.0e+00 2.8e-09 1.0e+01 7.4e-04 ↘
[ Info: 8 9.2e-02 5.5e+00 1.2e-04 1.0e+00 9.2e-10 1.0e+01 5.8e-04 ↘
[ Info: 9 9.2e-02 5.5e+00 1.3e-05 1.0e+00 3.1e-10 1.0e+01 7.1e-05 ↘
[ Info: R2DH_M: terminating with √(ξ/ν) = 1.3279891624205704e-5
# Basic R2N (GitHub)
[ Info: using R2N_R2_B with subsolver = None
[ Info: outer inner f(x) h(x) √(ξ1/ν) √ξ ρ σ ‖x‖ ‖s‖ ‖Bₖ‖ R2N_B
[ Info: 1 2 2.0e+01 0.0e+00 3.1e+00 3.1e+00 2.4e+00 6.1e-06 0.0e+00 3.8e+00 1.0e+00 ↘
[ Info: 2 26 8.7e+00 4.4e+00 2.1e+00 1.6e+00 1.1e+00 2.0e-06 3.8e+00 5.0e+00 1.7e+00 ↘
[ Info: 3 37 2.4e+00 5.0e+00 7.7e-01 5.5e-01 1.8e+00 6.7e-07 8.7e+00 1.1e+00 1.9e+00 ↘
[ Info: 4 53 1.6e+00 5.1e+00 5.9e-01 4.0e-01 1.4e+00 2.2e-07 9.3e+00 1.7e+00 2.2e+00 ↘
[ Info: 5 48 7.2e-01 5.4e+00 4.5e-01 2.9e-01 1.6e+00 7.5e-08 9.7e+00 8.5e-01 2.4e+00 ↘
[ Info: 6 54 4.1e-01 5.4e+00 3.1e-01 2.0e-01 1.5e+00 2.5e-08 9.8e+00 8.5e-01 2.5e+00 ↘
[ Info: 7 12 1.8e-01 5.5e+00 2.5e-01 1.5e-01 1.3e+00 8.3e-09 9.9e+00 5.0e-01 2.7e+00 ↘
[ Info: 8 11 1.1e-01 5.5e+00 1.1e-01 6.8e-02 1.2e+00 2.8e-09 1.0e+01 2.3e-01 2.7e+00 ↘
[ Info: 9 8 9.5e-02 5.5e+00 4.4e-02 2.7e-02 1.2e+00 9.2e-10 1.0e+01 8.6e-02 2.7e+00 ↘
[ Info: 10 9 9.3e-02 5.5e+00 1.8e-02 1.1e-02 1.3e+00 3.1e-10 1.0e+01 3.2e-02 2.6e+00 ↘
[ Info: 11 11 9.2e-02 5.5e+00 8.9e-03 5.5e-03 1.2e+00 1.0e-10 1.0e+01 1.7e-02 2.6e+00 ↘
[ Info: 12 11 9.2e-02 5.5e+00 3.7e-03 2.3e-03 1.2e+00 3.4e-11 1.0e+01 7.2e-03 2.6e+00 ↘
[ Info: 13 11 9.2e-02 5.5e+00 1.6e-03 1.0e-03 1.2e+00 1.1e-11 1.0e+01 3.1e-03 2.6e+00 ↘
[ Info: 14 11 9.2e-02 5.5e+00 7.4e-04 4.6e-04 1.3e+00 3.8e-12 1.0e+01 1.3e-03 2.6e+00 ↘
[ Info: 15 11 9.2e-02 5.5e+00 3.4e-04 2.1e-04 1.2e+00 1.3e-12 1.0e+01 6.4e-04 2.7e+00 ↘
[ Info: 16 11 9.2e-02 5.5e+00 1.5e-04 9.3e-05 1.3e+00 4.2e-13 1.0e+01 2.9e-04 2.7e+00 ↘
[ Info: 17 1 9.2e-02 5.5e+00 7.5e-05 4.6e-05 1.4e-13 1.0e+01 2.8e-05 2.7e+00
[ Info: R2N_B: terminating with √(ξ1/ν) = 7.48260420816068e-5
# Maxence R2N
[ Info: using R2N_R2_M with subsolver = None
[ Info: outer inner f(x) h(x) √(ξ1/ν) ρ σ ‖x‖ ‖s‖ ‖B‖ R2N_M
[ Info: 0 1 2.0e+01 0.0e+00 3.1e+00 2.4e+00 6.1e-06 0.0e+00 3.8e+00 1.0e+00 ↘
[ Info: 1 25 8.7e+00 4.4e+00 2.1e+00 1.1e+00 2.0e-06 3.8e+00 5.0e+00 1.7e+00 ↘
[ Info: 2 36 2.4e+00 5.0e+00 7.7e-01 1.8e+00 6.7e-07 8.7e+00 1.1e+00 1.9e+00 ↘
[ Info: 3 52 1.6e+00 5.1e+00 5.9e-01 1.4e+00 2.2e-07 9.3e+00 1.7e+00 2.2e+00 ↘
[ Info: 4 47 7.2e-01 5.4e+00 4.5e-01 1.6e+00 7.5e-08 9.7e+00 8.5e-01 2.4e+00 ↘
[ Info: 5 53 4.1e-01 5.4e+00 3.1e-01 1.5e+00 2.5e-08 9.8e+00 8.5e-01 2.5e+00 ↘
[ Info: 6 11 1.8e-01 5.5e+00 2.5e-01 1.3e+00 8.3e-09 9.9e+00 5.0e-01 2.7e+00 ↘
[ Info: 7 10 1.1e-01 5.5e+00 1.1e-01 1.2e+00 2.8e-09 1.0e+01 2.3e-01 2.7e+00 ↘
[ Info: 8 7 9.5e-02 5.5e+00 4.4e-02 1.2e+00 9.2e-10 1.0e+01 8.6e-02 2.7e+00 ↘
[ Info: 9 8 9.3e-02 5.5e+00 1.8e-02 1.3e+00 3.1e-10 1.0e+01 3.2e-02 2.6e+00 ↘
[ Info: 10 10 9.2e-02 5.5e+00 8.9e-03 1.2e+00 1.0e-10 1.0e+01 1.7e-02 2.6e+00 ↘
[ Info: 11 10 9.2e-02 5.5e+00 3.7e-03 1.2e+00 3.4e-11 1.0e+01 7.2e-03 2.6e+00 ↘
[ Info: 12 10 9.2e-02 5.5e+00 1.6e-03 1.2e+00 1.1e-11 1.0e+01 3.1e-03 2.6e+00 ↘
[ Info: 13 10 9.2e-02 5.5e+00 7.4e-04 1.3e+00 3.8e-12 1.0e+01 1.3e-03 2.6e+00 ↘
[ Info: 14 10 9.2e-02 5.5e+00 3.4e-04 1.2e+00 1.3e-12 1.0e+01 6.4e-04 2.7e+00 ↘
[ Info: 15 10 9.2e-02 5.5e+00 1.5e-04 1.3e+00 4.2e-13 1.0e+01 2.9e-04 2.7e+00 ↘
[ Info: 16 0 9.2e-02 5.5e+00 7.5e-05 1.3e+00 1.4e-13 1.0e+01 2.9e-04 2.7e+00 ↘
[ Info: R2N_M: terminating with √(ξ1/ν) = 7.482604208161469e-5

I compared Maxence’s implementation directly with the one on GitHub. Using the same solver options, we observe that the results are practically identical, and the differences in stationarity measures are very small. I believe we’re ready to merge @dpo @MaxenceGollier.
Thank you both!
Here are the benchmark results.
@MohamedLaghdafHABIBOULLAH, 069b854 seems to be causing trouble for type stability. I think we might have to remove it and do it the old way; how necessary is it for you to keep it as such?
Fixed the merge issues from #166; I will show the benchmarks in the following comments.
Tests will fail as long as `opnorm` allocates.
Additionally, I need LinearOperators v2.9 for the tests to pass, which does not seem to be the case on FreeBSD? @dpo
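The `opnorm` allocation concern can be checked locally; a hedged sketch (using the generic `LinearAlgebra.opnorm` on a dense matrix, not the LinearOperators method under discussion) of measuring allocations with `@allocated`:

```julia
using LinearAlgebra

A = randn(100, 100)

# Warm up first so the measurement excludes compilation overhead.
opnorm(A)
bytes = @allocated opnorm(A)
println("opnorm allocated $bytes bytes")

# In a test suite, a zero-allocation requirement would read something like:
#   @test (@allocated opnorm(A)) == 0
# (dense opnorm computes an SVD and will allocate; the PR's concern is the
#  operator version, whose allocations can be checked the same way)
```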