Speed up SPARSE_NORMAL_CHOLESKY when using CX_SPARSE.
When using sparse Cholesky factorization to solve the linear
least squares problem:
Ax = b
There are two sources of computational complexity.
1. Computing H = A'A
2. Computing the sparse Cholesky factorization of H.
Doing 1. using CX_SPARSE is particularly expensive, as it uses
the generic cs_multiply function, which recomputes the sparsity
structure of H on every call, reallocates memory, and does not
exploit the fact that the matrix being computed is a symmetric
outer product.
This change adds a custom symmetric outer product algorithm for
CompressedRowSparseMatrix.
It has a symbolic phase, where it computes the sparsity structure
of the output matrix and a "program" which allows the actual
multiplication routine to determine exactly which entry in the
values array each term in the product contributes to.
With these two pieces of information, the outer product H = A'A
can be computed extremely quickly, without any runtime reasoning
about the structure of H.
Further gains in efficiency are made by exploiting the block
structure of A.
With this change, SPARSE_NORMAL_CHOLESKY with CX_SPARSE as the
backend achieves a > 300% speedup on some problems.
The symbolic analysis phase of the solver is somewhat more expensive
now, but the increased cost is recovered within 3-4 iterations.
Change-Id: I5e4a72b4d03ba41b378a2634330bc22b299c0f12