fix: update A_backup before LU to prevent stale QR fallback #901
When the LinearSolve cache is reused across multiple solves (e.g. by NonlinearSolve updating cache.A with new Jacobians each Newton step), A_backup was never updated from its initial value (prob.A at init time). If LU factorization failed and the QR fallback fired, it would restore cache.A from the stale A_backup instead of the current matrix, causing the QR solve to operate on the wrong matrix. This led to NonlinearSolve's NewtonRaphson failing to converge on problems with singular Jacobians, producing wrong results (e.g. 1.955 instead of 2.0 in SciMLBase initialization tests).

Fix: save cache.A into A_backup before each in-place LU factorization, so the QR fallback always restores the correct (current) matrix.

Also updates @test_broken to @test for a SparseMatrixCSC{Float64,Int32} case that now passes with the fix.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
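The failure mode described in this commit can be reproduced in miniature. The sketch below is a hypothetical stand-in, not LinearSolve.jl's code: `ToyCache` and `toy_solve!` are names invented here, and only the `LinearAlgebra` standard-library calls (`lu!`, `issuccess`, `qr!`, `ColumnNorm`) are real. It assumes Julia 1.7 or later for the pivoted QR form, and it assumes the fallback restores `A` from the backup before solving with pivoted QR.

```julia
using LinearAlgebra

# Hypothetical stand-in for a reusable linear-solve cache (not LinearSolve.jl's type).
mutable struct ToyCache
    A::Matrix{Float64}         # working matrix, overwritten in place by lu!
    A_backup::Matrix{Float64}  # matrix restored into A before the QR fallback
    b::Vector{Float64}
end

# Try LU in place; on failure, restore A from the backup and fall back to pivoted QR.
function toy_solve!(cache::ToyCache)
    F = lu!(cache.A; check = false)   # overwrites cache.A with the LU factors
    issuccess(F) && return F \ cache.b
    copyto!(cache.A, cache.A_backup)  # restores whatever the backup currently holds
    return qr!(cache.A, ColumnNorm()) \ cache.b
end

A0 = [1.0 0.0; 0.0 1.0]   # matrix present at init time
J  = [1.0 1.0; 1.0 1.0]   # later, singular Jacobian written into cache.A
b  = [2.0, 2.0]

# Stale backup: still holds the init-time matrix.
x_stale = toy_solve!(ToyCache(copy(J), copy(A0), copy(b)))

# Synced backup: holds the current matrix.
x_synced = toy_solve!(ToyCache(copy(J), copy(J), copy(b)))
```

With the stale backup, the fallback solves against the identity, so `x_stale` does not satisfy `J * x ≈ b`. With a current backup, the QR solution should satisfy it, since `b` lies in the range of `J`.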
Move the A_backup sync from the solve path into setproperty!(cache, :A, x). When callers (e.g. NonlinearSolve) update cache.A in-place via copyto! and then trigger setproperty! with `cache.A = cache.A`, the backup is now synced at the interface boundary rather than scattered across every LU variant.

The previous approach (copyto! into A_backup before each LU) mutated prob.A through the A_backup alias, corrupting user problem data and causing basictests.jl:108 to fail (values off by 2x).

The guard `x === getfield(cache, :A)` ensures copyto! only fires on the in-place re-trigger pattern, not when a new matrix object is assigned.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
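A minimal sketch of the re-trigger pattern described above, under stated assumptions: `SyncCache` and its `isfresh` flag are hypothetical names chosen for illustration, and the real `LinearCache` has more fields and logic. Only the `x === getfield(cache, :A)` guard is taken from the commit message.

```julia
# Hypothetical cache type; names are illustrative, not LinearSolve.jl's.
mutable struct SyncCache
    A::Matrix{Float64}
    A_backup::Matrix{Float64}
    isfresh::Bool   # assumed flag: true when A changed and must be refactorized
end

function Base.setproperty!(cache::SyncCache, name::Symbol, x)
    if name === :A
        # In-place re-trigger: the caller mutated cache.A and then wrote
        # `cache.A = cache.A`, so x is the very same object as the stored field.
        if x === getfield(cache, :A)
            copyto!(getfield(cache, :A_backup), x)
        end
        setfield!(cache, :isfresh, true)
    end
    return setfield!(cache, name, convert(fieldtype(SyncCache, name), x))
end

cache = SyncCache([1.0 0.0; 0.0 1.0], [1.0 0.0; 0.0 1.0], false)
copyto!(cache.A, [1.0 1.0; 1.0 1.0])  # in-place update; the backup is still stale here
cache.A = cache.A                      # re-trigger: the backup is synced at this point
```

Assigning a different matrix object (for example `cache.A = [5.0 0.0; 0.0 5.0]`) takes the other branch in this sketch and leaves `A_backup` untouched.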
Revised approach: sync A_backup in
When an ODE system is resized (e.g., via resize! callbacks), A_backup in DefaultLinearSolverInit retains its original dimensions while cache.A gets resized externally. The setproperty! method then crashes with BoundsError when trying to copyto! between mismatched sizes.

Two fixes:
1. setproperty!: check size(A_backup) == size(x) before copyto!, and replace A_backup with copy(x) when sizes differ.
2. New Base.resize!(cache::LinearCache, i): proactively resize A_backup when integrators call resize!, and mark cache as stale.

Fixes regression from SciML#901 that broke OrdinaryDiffEq resize tests.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
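The size check in fix 1 can be sketched as a small standalone helper. `sync_backup` is a hypothetical name used only here. It returns the array that should be stored as the new backup, reusing the old storage when the sizes match and allocating a copy otherwise.

```julia
# Hypothetical helper illustrating the size check; not the PR's actual code.
# copyto! returns its destination, so both branches yield the backup to keep.
sync_backup(backup::AbstractMatrix, x::AbstractMatrix) =
    size(backup) == size(x) ? copyto!(backup, x) : copy(x)

backup = zeros(2, 2)
A = [1.0 2.0; 3.0 4.0]
b1 = sync_backup(backup, A)    # same size: the existing storage is reused

A_big = ones(3, 3)             # stands in for the system after a resize
b2 = sync_backup(b1, A_big)    # size mismatch: a fresh copy replaces the backup
```

Replacing the backup rather than copying into it avoids the mismatched-size `copyto!` that the commit message identifies as the source of the `BoundsError`.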
Summary
- A stale `A_backup` caused the QR fallback to use the wrong matrix when the LinearSolve cache is reused across multiple solves
- `A_backup` was set once from `prob.A` at init time and never updated
- When `cache.A` was updated with new Jacobians via `copyto!`, the QR fallback would restore from the stale initial matrix instead of the current one

Root Cause
Introduced in v3.59.0 (commit b477d98, "Restore cache.A from prob.A before QR safety fallback"). The `A_backup` field in `DefaultLinearSolverInit` stored a reference to `prob.A` and was never updated when `cache.A` was modified externally.

When NonlinearSolve's `NewtonRaphson` solves a system with a singular Jacobian:

1. `cache.A` is updated with the current Jacobian via `copyto!`
2. LU factorization fails and the QR fallback runs `copyto!(cache.A, A_backup)`, restoring the initial Jacobian, not the current one

This caused SciMLBase's downstream `initialization.jl` tests to fail: `NewtonRaphson` computed `u0[2] = 1.955` instead of `2.0`.

Fix
Before each in-place LU factorization, save the current `cache.A` into `A_backup`. An `===` guard skips the copy when `alias_A=true` (same object), preserving existing alias detection logic.

Applied to all 5 LU variant blocks (LU, GenericLU, MKL, AppleAccelerate → shared block; RFLU; BLIS; CUDA; Metal).
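As a rough illustration of the guarded copy described in this section, the sketch below wraps it in a hypothetical helper, `backup_then_lu!`. This is not the PR's actual diff, which is not reproduced here. It assumes plain dense matrices and uses only `lu!` from the `LinearAlgebra` standard library.

```julia
using LinearAlgebra

# Hypothetical helper; the name and signature are illustrative only.
function backup_then_lu!(A::Matrix{Float64}, A_backup::Matrix{Float64})
    # Skip the copy when both names refer to the same object (the aliased case).
    A_backup === A || copyto!(A_backup, A)
    return lu!(A; check = false)   # overwrites A with its LU factors
end

A = [4.0 3.0; 6.0 3.0]
A_backup = zeros(2, 2)
F = backup_then_lu!(A, A_backup)   # A_backup now holds the pre-factorization matrix
```

After the call, `A` contains the LU factors while `A_backup` still holds the original entries, which is what a later fallback would need to restore.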
Test plan
- `default_algs.jl` tests pass
- `@test_broken` for `SparseMatrixCSC{Float64, Int32}` now passes (updated to `@test`)
- `NewtonRaphson` converges correctly on the SciMLBase initialization scenario

🤖 Generated with Claude Code