Skip to content

Hager-Zhang converges incorrectly within Optim.ConjugateGradient due to flatness check #175

@kbarros

Description

@kbarros

I found a case where the ConjugateGradient method of Optim.jl terminates early, at a non-stationary point, without raising an error. After some digging, the problem seems associated with LineSearches. I can't tell whether the actual bug is in in LineSearches, or in the way that Optim calls the Hager-Zhang line search. Your help in diagnosing this would be greatly appreciated.

Below is a simplified example illustrating how Hager-Zhang fails to converge correctly, given the (effective) parameters that are provided by Optim.jl:

using LineSearches

ϕ(c) = 3.042968312396456 - 832.4270136930788*c - 132807.15591801773*c^2 + 7.915421661743959e6*c^3 - 1.570284840040962e8*c^4 + 1.4221708747294645e9*c^5 - 5.970065591205604e9*c^6 + 9.405512899903618e9*c^7
(c) = -832.4270136930788 - 265614.31183603546*c + 2.3746264985231876e7*c^2 - 6.281139360163848e8*c^3 + 7.110854373647323e9*c^4 - 3.582039354723362e10*c^5 + 6.5838590299325325e10*c^6

function ϕdϕ(c)
    println("ϕ($c)=$(ϕ(c)), dϕ($c)=$((c))")
    (ϕ(c), (c))
end

c0 = 0.2
ϕ0, dϕ0 = ϕdϕ(0)
println("  ^ Is it valid to evaluate these away from c0=0.2?")

ls = HagerZhang()
res = ls(ϕ, dϕ, ϕdϕ, c0, ϕ0, dϕ0)

The printed output of this code is:

ϕ(0)=3.042968312396456, dϕ(0)=-832.4270136930788
  ^ Is it valid to evaluate these away from c0=0.2?
ϕ(0.2)=3.117411287254072, dϕ(0.2)=-505.33622492291033
ϕ(0.1)=-3.503584823341612, dϕ(0.1)=674.947830358913
ϕ(0.055223623837016025)=0.5244246783084083, dϕ(0.055223623837016025)=738.3388472434362

Note that the final value c=0.055... is not a zero point of the derivative .

The HZ line search terminates early because this "flatness" condition is hit: https://github.com/JuliaNLSolvers/LineSearches.jl/blob/master/src/hagerzhang.jl#L282-L285

However, it's not clear to me whether the underlying problem is the HZ flatness condition, or the way that the call to method.linesearch! is made in Optim: https://github.com/JuliaNLSolvers/Optim.jl/blob/master/src/utilities/perform_linesearch.jl#L41-L60).

Is it OK that the values of phi_0, dphi_0 were calculated for alpha == 0 and not state.alpha == 0.2?

To fix the bug, therefore, it seems that there are two possibilities:

  1. The HZ flatness condition needs to be modified.
  2. In Optim, prior to calling method.linesearch!, the values of phi_0, dphi_0 should be recalculated for the newly guessed state.alpha.

Thanks in advance.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions