Benchmarks / Backends

Backends: Same Solver, Different Linear Algebra

A curated set of (solver, problem) cases, each run to a fixed iteration budget varying only the linear-algebra backend. Scaling cases plot time against problem size n on log–log axes; fixed-size cases show one bar per backend. As a solver needs richer linear algebra, fewer backends can run it; those gaps are intentional, and differ in cause. Times are criterion's mean per full solve, so lower is better.

Gradient Descent · Rosenbrock

First-order, vector tier only: runs on all four backends.

Nelder–Mead · Ackley

Derivative-free simplex, vector tier only: all four backends on a multimodal landscape.

L-BFGS · Styblinski–Tang

Quasi-Newton in compact form (no dense matrix): all four backends.

BFGS · Levy

Dense O(n²) rank-2 inverse-Hessian update. No ndarray: Array2 implements neither the general rank-1 update nor a matrix identity.

CMA-ES · Rastrigin

Stochastic, multimodal; a per-generation symmetric eigendecomposition. No ndarray (no eigensolver); Vec works via a hand-rolled cyclic-Jacobi solver.

Levenberg–Marquardt · Sparse least-squares

Sparse least-squares: damped normal equations with a sparse JᵀJ and sparse Cholesky. Only nalgebra and faer carry sparse matrices, so Vec/ndarray are out.

Gauss–Newton · Sparse least-squares

Undamped Gauss–Newton on the same sparse, zero-residual design (full column rank ⇒ JᵀJ stays SPD). nalgebra + faer only.

Measured 2026-06-02 on AMD Ryzen 9 7900 12-Core Processor (linux/x64), criterion mean per solve over a fixed 200-iteration budget (a cap: the least-squares and CMA-ES cases converge sooner). Both axes are logarithmic. Absolute times are machine-specific; compare the spread between backends within a chart, not across machines.

To watch these solvers converge interactively, try the visualizer.