Optimize J' * J in sparse_normal_cholesky_solver.
1. Add stype to the outer product computation to control whether the
output matrix is upper or lower triangular. For SuiteSparse, the
upper triangular matrix is generated, which SuiteSparse can use
directly for Cholesky factorization without the overhead of a matrix
transpose.
2. Change the outer product computation to block multiplication. This
reduces the cost of the sort in preprocessing and also allows the
block outer product to be formulated as dense Eigen block matrix
multiplication.
3. Benchmark: solving 32 Tango problems on a Qualcomm MSM8994
Cortex-A53 (1.55GHz):
before change: 140 seconds
after change: 131 seconds
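
For illustration only, items 1 and 2 can be sketched as below. This is a
minimal, standalone sketch and not the code in this change: DenseBlock,
Cell, RowBlock, MakeBlock, AddTransposeProduct and
UpperTriangularOuterProduct are hypothetical names, plain loops stand in
for the dense Eigen block multiplication, and a std::map stands in for
the compressed row output. It visits only block pairs (p, q) with q >= p
within each row block of J, so only the upper triangular blocks of
J' * J are produced.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical dense block stored in row-major order.
struct DenseBlock {
  int rows;
  int cols;
  std::vector<double> values;  // rows * cols entries
};

// Hypothetical cell: one dense block located at a column-block index.
struct Cell {
  int col_block;
  DenseBlock block;
};

// One row block of J; cells are assumed sorted by col_block.
typedef std::vector<Cell> RowBlock;

DenseBlock MakeBlock(int rows, int cols, std::vector<double> values) {
  DenseBlock b;
  b.rows = rows;
  b.cols = cols;
  b.values = values;
  return b;
}

// result += a' * b, where a is r x m, b is r x n and result is m x n.
void AddTransposeProduct(const DenseBlock& a, const DenseBlock& b,
                         DenseBlock* result) {
  for (int k = 0; k < a.rows; ++k) {
    for (int i = 0; i < a.cols; ++i) {
      for (int j = 0; j < b.cols; ++j) {
        result->values[i * b.cols + j] +=
            a.values[k * a.cols + i] * b.values[k * b.cols + j];
      }
    }
  }
}

// Accumulates the upper triangular blocks of J' * J, keyed by
// (block row, block column). Diagonal blocks are kept as full dense
// blocks here; a real implementation would keep only their upper part.
std::map<std::pair<int, int>, DenseBlock> UpperTriangularOuterProduct(
    const std::vector<RowBlock>& j) {
  std::map<std::pair<int, int>, DenseBlock> product;
  for (std::size_t r = 0; r < j.size(); ++r) {
    const RowBlock& row = j[r];
    for (std::size_t p = 0; p < row.size(); ++p) {
      // Starting q at p skips every block below the diagonal.
      for (std::size_t q = p; q < row.size(); ++q) {
        const Cell& a = row[p];
        const Cell& b = row[q];
        DenseBlock& out =
            product[std::make_pair(a.col_block, b.col_block)];
        if (out.values.empty()) {
          out.rows = a.block.cols;
          out.cols = b.block.cols;
          out.values.assign(
              static_cast<std::size_t>(out.rows) * out.cols, 0.0);
        }
        AddTransposeProduct(a.block, b.block, &out);
      }
    }
  }
  return product;
}
```

Because the cells in each row block are sorted by column block, every
emitted key (i, j) satisfies i <= j, which is the layout a solver that
reads only the upper triangle can consume without a transpose.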
Change-Id: I8054114cef911de6a303310a448821ca296e4744
diff --git a/internal/ceres/unsymmetric_linear_solver_test.cc b/internal/ceres/unsymmetric_linear_solver_test.cc
index 640009d..95797c5 100644
--- a/internal/ceres/unsymmetric_linear_solver_test.cc
+++ b/internal/ceres/unsymmetric_linear_solver_test.cc
@@ -78,6 +78,15 @@
for (int i = 0; i < A_->num_cols(); ++i) {
crsm->mutable_col_blocks()->push_back(1);
}
+
+ // With all blocks of size 1, crsb_rows and crsb_cols are equivalent to
+ // rows and cols.
+ std::copy(crsm->rows(), crsm->rows() + crsm->num_rows() + 1,
+ std::back_inserter(*crsm->mutable_crsb_rows()));
+
+ std::copy(crsm->cols(), crsm->cols() + crsm->num_nonzeros(),
+ std::back_inserter(*crsm->mutable_crsb_cols()));
+
transformed_A.reset(crsm);
} else {
LOG(FATAL) << "Unknown linear solver : " << options.type;