Disable gradient and Hessian backends for NLSModels #357
Conversation
A user can still enable them manually if they really want to.
It is a breaking modification but I am ok with it.
gradient_backend = EmptyADbackend(),
hprod_backend = EmptyADbackend(),
hessian_backend = EmptyADbackend(),
hprod_residual_backend = EmptyADbackend(),
Don't you need this one?
Typical NLS solvers do not use hprod_residual or hessian_residual. My other motivation for deactivating those backends is that on a handwriting recognition model, AD for the Hessian of the residual was taking ages and that backend was not even needed.
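The point that typical NLS solvers never touch the residual Hessians can be sketched numerically: a Gauss-Newton iteration only evaluates the residual and its Jacobian. The snippet below is an illustrative Python/NumPy sketch, not the package's Julia implementation, and the Rosenbrock-style residual is a hypothetical choice for demonstration.

```python
import numpy as np

def gauss_newton(F, J, x0, iters=20):
    """Minimize 0.5 * ||F(x)||^2 using only residual values and the
    residual Jacobian; residual Hessians are never formed."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = F(x)
        # Gauss-Newton step: least-squares solution of J(x) d = -F(x),
        # i.e. (J'J) d = -J'r, dropping the second-order residual terms.
        d, *_ = np.linalg.lstsq(J(x), -r, rcond=None)
        x = x + d
    return x

# Rosenbrock-style residual (illustrative): F(x) = [10*(x2 - x1^2), 1 - x1]
F = lambda x: np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])
J = lambda x: np.array([[-20.0 * x[0], 10.0], [-1.0, 0.0]])

x_star = gauss_newton(F, J, np.array([-1.2, 1.0]))
# converges to (1, 1), the minimizer of 0.5 * ||F(x)||^2
```

Note that the loop calls only `F` and `J`: nothing corresponding to `hessian_residual_backend` or `hprod_residual_backend` is needed.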
I agree about the Hessian backends: they are very slow when, for example, we use OptimizationProblems and set the problem size very large (e.g. 10k).
One question:
Why `gradient_backend = EmptyADbackend()`?
Because the gradient is J(x)' * F(x), and in practice we don't work directly with the NLP problem that has 0.5‖F(x)‖² as objective.
We exploit the NLS formulation and rely on optimization solvers like Gauss-Newton or Levenberg-Marquardt.
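The identity above can be checked numerically: the gradient of f(x) = 0.5‖F(x)‖² equals J(x)' F(x), so a Jacobian-transpose product replaces a dedicated gradient backend. The sketch below (Python/NumPy, with an illustrative residual; not the package's code) verifies this against finite differences.

```python
import numpy as np

# Illustrative residual F(x) and its Jacobian for a small NLS problem.
def F(x):
    return np.array([x[0] - 1.0, 10.0 * (x[1] - x[0] ** 2)])

def J(x):
    return np.array([[1.0, 0.0], [-20.0 * x[0], 10.0]])

def grad_via_jtprod(x):
    # Gradient of f(x) = 0.5 * ||F(x)||^2 is J(x)' * F(x):
    # no separate gradient backend is required.
    return J(x).T @ F(x)

def grad_fd(x, h=1e-6):
    # Central finite-difference check of the same gradient.
    f = lambda z: 0.5 * np.dot(F(z), F(z))
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

x = np.array([0.3, -0.7])
# The two gradients agree to finite-difference accuracy.
```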
jtprod_residual_backend = get_default_backend(:jtprod_residual_backend, backend),
jacobian_residual_backend = get_default_backend(:jacobian_residual_backend, backend, matrix_free),
- hessian_residual_backend = get_default_backend(:hessian_residual_backend, backend, matrix_free),
+ hessian_residual_backend = EmptyADbackend(),
And this one too?
Currently, we assume that only the default backends are defined for NLS models. In a future revision, it would be more robust to test the backends that are not set to EmptyADbackend().
Pull Request Overview
This PR disables gradient and Hessian backends for NLSModels by default while still allowing users to manually enable them. The change addresses potential compatibility or performance issues with these backends in NLS contexts.
- Replaces default backend assignments with EmptyADbackend() for gradient and Hessian-related backends
- Updates test exclusions to skip gradient and Hessian function tests for all NLS problems
- Reorganizes test logic to separate NLS models from regular NLP models
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/ad.jl | Disables gradient and Hessian backends by setting them to EmptyADbackend() in both ADModelNLSBackend functions |
| test/nls/nlpmodelstest.jl | Expands exclusion list to skip all gradient and Hessian-related functions instead of problem-specific exclusions |
| test/manual.jl | Restructures test logic to only test gradient/Hessian functions for non-NLS models and removes these backends from NLS model creation |
Thanks @dpo 👍