Benchmarks / Backends
Backends — same solver, different linear algebra
A curated set of (solver, problem) cases, each run to a fixed iteration
budget varying only the linear-algebra backend. Scaling cases plot time
against problem size n on log–log axes;
fixed-size cases show one bar per backend. As a solver needs richer
linear algebra, fewer backends can run it — those gaps are intentional,
and differ in cause. Times are criterion's mean per full solve, so
lower is better.
Gradient Descent · Rosenbrock
First-order, vector tier only — runs on all four backends.
Nelder–Mead · Ackley
Derivative-free simplex, vector tier only — all four backends on a multimodal landscape.
L-BFGS · Styblinski–Tang
Quasi-Newton in compact form (no dense matrix) — all four backends.
BFGS · Levy
Dense O(n²) rank-2 inverse-Hessian update. No ndarray: Array2 implements neither the general rank-1 update nor a matrix identity.
CMA-ES · Rastrigin
Stochastic, multimodal; a per-generation symmetric eigendecomposition. No ndarray (no eigensolver); Vec works via a hand-rolled cyclic-Jacobi solver.
Levenberg–Marquardt · Sparse least-squares
Sparse least-squares: damped normal equations with a sparse JᵀJ and sparse Cholesky. Only nalgebra and faer carry sparse matrices, so Vec/ndarray are out.
Gauss–Newton · Sparse least-squares
Undamped Gauss–Newton on the same sparse, zero-residual design (full column rank ⇒ JᵀJ stays SPD). nalgebra + faer only.
Measured 2026-06-02 on AMD Ryzen 9 7900 12-Core Processor (linux/x64), criterion mean per solve over a fixed 200-iteration budget (a cap — the least-squares and CMA-ES cases converge sooner). Both axes are logarithmic. Absolute times are machine-specific — compare the spread between backends within a chart, not across machines.
To watch these solvers converge interactively, try the visualizer.