Update documentation for Covariance
Change-Id: Ia4a7347ef8267b7107698d85fcbfc986111958dc
diff --git a/docs/source/solving.rst b/docs/source/solving.rst
index 86a931a..662887e 100644
--- a/docs/source/solving.rst
+++ b/docs/source/solving.rst
@@ -395,7 +395,7 @@
which the last `M` iterations are used to approximate the inverse Hessian
used to compute a quasi-Newton step [Nocedal]_, [ByrdNocedal]_.
-Currently Ceres Solver supports both a backtracking and interpolation
+Currently Ceres Solver supports both a backtracking and interpolation
based Armijo line search algorithm, and a sectioning / zoom interpolation
(strong) Wolfe condition line search algorithm. However, note that in order for
the assumptions underlying the ``BFGS`` and ``LBFGS`` methods to be
@@ -1968,23 +1968,68 @@
Number of threads to be used for evaluating the Jacobian and
estimation of covariance.
-.. member:: bool Covariance::Options::use_dense_linear_algebra
+.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type
- Default: ``false``
+ Default: ``SPARSE_QR`` or ``DENSE_SVD``
- When ``true``, ``Eigen``'s ``JacobiSVD`` algorithm is used to
- perform the computations. It is an accurate but slow method and
- should only be used for small to moderate sized problems.
+ Ceres supports three different algorithms for covariance
+ estimation, which represent different tradeoffs in speed, accuracy
+ and reliability.
- When ``false``, ``SuiteSparse/CHOLMOD`` is used to perform the
- computation. Recent versions of ``SuiteSparse`` (>= 4.2.0) provide
- a much more efficient method for solving for rows of the covariance
- matrix. Therefore, if you are doing large scale covariance
- estimation, we strongly recommend using a recent version of
- ``SuiteSparse``.
+ 1. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
+ computations. It computes the singular value decomposition
- This setting also has an effect on how the following two options
- are interpreted.
+ .. math:: U S V^\top = J
+
+ and then uses it to compute the pseudo inverse of J'J as
+
+ .. math:: (J'J)^{\dagger} = V S^{\dagger} S^{\dagger} V^\top
+
+ It is an accurate but slow method and should only be used for
+ small to moderate sized problems. It can handle full-rank as
+ well as rank deficient Jacobians.
+
+ 2. ``SPARSE_CHOLESKY`` uses the ``CHOLMOD`` sparse Cholesky
+ factorization library to compute the decomposition :
+
+ .. math:: R^\top R = J^\top J
+
+ and then
+
+ .. math:: \left(J^\top J\right)^{-1} = \left(R^\top R\right)^{-1}
+
+ It is a fast algorithm for sparse matrices that should be used
+ when the Jacobian matrix J is well conditioned. For
+ ill-conditioned matrices, this algorithm can fail
+ unpredictably. This is because Cholesky factorization is not a
+ rank-revealing factorization, i.e., it cannot reliably detect
+ when the matrix being factorized is not of full
+ rank. ``SuiteSparse``/``CHOLMOD`` supplies a heuristic for
+ checking if the matrix is rank deficient (``cholmod_rcond``),
+ but it is only a heuristic and can give both false positives
+ and false negatives.
+
+ Recent versions of ``SuiteSparse`` (>= 4.2.0) provide a much more
+ efficient method for solving for rows of the covariance
+ matrix. Therefore, if you are using ``SPARSE_CHOLESKY``, we
+ strongly recommend using a recent version of ``SuiteSparse``.
+
+ 3. ``SPARSE_QR`` uses the ``SuiteSparseQR`` sparse QR factorization
+ library to compute the decomposition
+
+ .. math::
+
+ QR &= J\\
+ \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}
+
+ It is a moderately fast algorithm for sparse matrices, which at
+ the price of more time and memory than the ``SPARSE_CHOLESKY``
+ algorithm is numerically better behaved and is rank revealing,
+ i.e., it can reliably detect when the Jacobian matrix is rank
+ deficient.
+
+ Neither ``SPARSE_CHOLESKY`` nor ``SPARSE_QR`` is capable of
+ computing the covariance if the Jacobian is rank deficient.
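The tradeoffs among these three algorithms can be illustrated with a small
NumPy sketch (illustrative Python, not the Ceres implementation; the
Jacobian ``J`` is a made-up rank-deficient example):

```python
import numpy as np

# A deliberately rank-deficient Jacobian: column 2 = 2 * column 1.
J = np.array([[1.0, 2.0],
              [1.0, 2.0],
              [0.0, 0.0]])

# DENSE_SVD-style pseudo inverse of J^T J via the SVD of J:
# (J^T J)^+ = V S^+ S^+ V^T, dropping near-zero singular values.
U, s, Vt = np.linalg.svd(J, full_matrices=False)
tol = max(J.shape) * np.finfo(float).eps * s.max()
s_pinv = np.where(s > tol, 1.0 / s, 0.0)
covariance = Vt.T @ np.diag(s_pinv ** 2) @ Vt

# QR-style rank revelation: rank(J) < num_col(J) flags the rank
# deficiency that a Cholesky factorization of J^T J could miss.
rank = np.linalg.matrix_rank(J)
print(rank, J.shape[1])  # 1 2
```

Only the SVD route produces a (pseudo-inverse) result here; a
Cholesky factorization of this singular :math:`J^\top J` would fail or
silently produce garbage, which is exactly the failure mode described above.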
.. member:: int Covariance::Options::min_reciprocal_condition_number
@@ -2009,89 +2054,79 @@
-2.0471e+14 2.0471e+14
\end{bmatrix}
- This is not a useful result.
- The reciprocal condition number of a matrix is a measure of
- ill-conditioning or how close the matrix is to being singular/rank
- deficient. It is defined as the ratio of the smallest eigenvalue of
- the matrix to the largest eigenvalue. In the above case the
- reciprocal condition number is about :math:`10^{-16}`. Which is
- close to machine precision and even though the inverse exists, it
- is meaningless, and care should be taken to interpet the results of
- such an inversion.
+ This is not a useful result. Therefore, by default
+ :func:`Covariance::Compute` will return ``false`` if a rank
+ deficient Jacobian is encountered. How rank deficiency is detected
+ depends on the algorithm being used.
- Matrices with condition number lower than
- ``min_reciprocal_condition_number`` are considered rank deficient
- and by default Covariance::Compute will return false if it
- encounters such a matrix.
+ 1. ``DENSE_SVD``
- a. ``use_dense_linear_algebra = false``
+ .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}} < \sqrt{\text{min_reciprocal_condition_number}}
- When performing large scale sparse covariance estimation,
- computing the exact value of the reciprocal condition number is
- not possible as it would require computing the eigenvalues of
- :math:`J'J`.
+ where :math:`\sigma_{\text{min}}` and
+ :math:`\sigma_{\text{max}}` are the minimum and maximum
+ singular values of :math:`J` respectively.
- In this case we use cholmod_rcond, which uses the ratio of the
- smallest to the largest diagonal entries of the Cholesky
- factorization as an approximation to the reciprocal condition
- number.
+ 2. ``SPARSE_CHOLESKY``
+ .. math:: \text{cholmod_rcond} < \text{min_reciprocal_condition_number}
- However, care must be taken as this is a heuristic and can
- sometimes be a very crude estimate. The default value of
- ``min_reciprocal_condition_number`` has been set to a conservative
- value, and sometimes the ``Covariance::Compute`` may return false
- even if it is possible to estimate the covariance reliably. In
- such cases, the user should exercise their judgement before
- lowering the value of ``min_reciprocal_condition_number``.
+ Here ``cholmod_rcond`` is a crude estimate of the reciprocal
+ condition number of :math:`J^\top J`, computed from the ratio
+ of the minimum to the maximum diagonal entry of the Cholesky
+ factor :math:`R`. There are no theoretical guarantees
+ associated with this test. It can give both false positives and
+ false negatives. Use at your own risk. The default value of
+ ``min_reciprocal_condition_number`` has been set to a
+ conservative value, and sometimes :func:`Covariance::Compute`
+ may return ``false`` even if it is possible to estimate the
+ covariance reliably. In such cases, the user should exercise
+ their judgement before lowering the value of
+ ``min_reciprocal_condition_number``.
- b. ``use_dense_linear_algebra = true``
+ 3. ``SPARSE_QR``
- When using dense linear algebra, the user has more control in
- dealing with singular and near singular covariance matrices.
+ .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)
- As mentioned above, when the covariance matrix is near singular,
- instead of computing the inverse of :math:`J'J`, the
- Moore-Penrose pseudoinverse of :math:`J'J` should be computed.
-
- If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
- e_i)`, where :math:`lambda_i` is the :math:`i^\textrm{th}`
- eigenvalue and :math:`e_i` is the corresponding eigenvector,
- then the inverse of :math:`J'J` is
-
- .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'
-
- and computing the pseudo inverse involves dropping terms from
- this sum that correspond to small eigenvalues.
-
- How terms are dropped is controlled by
- `min_reciprocal_condition_number` and `null_space_rank`.
-
- If `null_space_rank` is non-negative, then the smallest
- `null_space_rank` eigenvalue/eigenvectors are dropped
- irrespective of the magnitude of :math:`\lambda_i`. If the ratio
- of the smallest non-zero eigenvalue to the largest eigenvalue in
- the truncated matrix is still below
- min_reciprocal_condition_number, then the
- `Covariance::Compute()` will fail and return `false`.
-
- Setting `null_space_rank = -1` drops all terms for which
-
- .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}
+ Here :math:`\operatorname{rank}(J)` is the estimate of the
+ rank of :math:`J` returned by the ``SuiteSparseQR`` algorithm.
+ It is a fairly reliable indication of rank deficiency.
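The three rank-deficiency tests can be mimicked in a few lines of NumPy
(an illustrative sketch, not Ceres code; the example Jacobian and the
threshold value are assumptions for the demo):

```python
import numpy as np

min_reciprocal_condition_number = 1e-14  # assumed conservative threshold

# A well-conditioned example Jacobian.
J = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])

# DENSE_SVD test: sigma_min / sigma_max < sqrt(threshold).
s = np.linalg.svd(J, compute_uv=False)
svd_deficient = s.min() / s.max() < np.sqrt(min_reciprocal_condition_number)

# SPARSE_CHOLESKY test, mimicking cholmod_rcond: ratio of the smallest
# to the largest diagonal entry of the Cholesky factor R of J^T J.
R = np.linalg.cholesky(J.T @ J).T
rcond = np.diag(R).min() / np.diag(R).max()
cholesky_deficient = rcond < min_reciprocal_condition_number

# SPARSE_QR test: rank(J) < num_col(J).
qr_deficient = np.linalg.matrix_rank(J) < J.shape[1]

print(svd_deficient, cholesky_deficient, qr_deficient)  # False False False
```

For this well-conditioned :math:`J` all three tests pass; replacing a
column with a copy of another would make the SVD and QR tests fire while
the Cholesky heuristic may or may not notice, matching the caveats above.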
.. member:: int Covariance::Options::null_space_rank
- Truncate the smallest ``null_space_rank`` eigenvectors when
- computing the pseudo inverse of :math:`J'J`.
+ When using ``DENSE_SVD``, the user has more control in dealing
+ with singular and near singular covariance matrices.
- If ``null_space_rank = -1``, then all eigenvectors with eigenvalues
- s.t.
+ As mentioned above, when the covariance matrix is near singular,
+ instead of computing the inverse of :math:`J'J`, the Moore-Penrose
+ pseudoinverse of :math:`J'J` should be computed.
- :math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}
+ If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
+ e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
+ eigenvalue and :math:`e_i` is the corresponding eigenvector, then
+ the inverse of :math:`J'J` is
- are dropped. See the documentation for
- ``min_reciprocal_condition_number`` for more details.
+ .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'
+
+ and computing the pseudo inverse involves dropping terms from this
+ sum that correspond to small eigenvalues.
+
+ How terms are dropped is controlled by
+ ``min_reciprocal_condition_number`` and ``null_space_rank``.
+
+ If ``null_space_rank`` is non-negative, then the smallest
+ ``null_space_rank`` eigenvalue/eigenvector pairs are dropped
+ irrespective of the magnitude of :math:`\lambda_i`. If the
+ ratio of the smallest non-zero eigenvalue to the largest
+ eigenvalue in the truncated matrix is still below
+ ``min_reciprocal_condition_number``, then
+ :func:`Covariance::Compute` will fail and return ``false``.
+
+ Setting ``null_space_rank = -1`` drops all terms for which
+
+ .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}
+
+ This option has no effect on the ``SPARSE_QR`` and
+ ``SPARSE_CHOLESKY`` algorithms.
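The truncation described above can be sketched with NumPy's symmetric
eigendecomposition (illustrative Python, not the Ceres implementation;
the near-singular matrix is a made-up example):

```python
import numpy as np

# A near-singular J^T J: its smallest eigenvalue is ~0.
JtJ = np.array([[1.0, 1.0],
                [1.0, 1.0 + 1e-12]])

lam, E = np.linalg.eigh(JtJ)   # eigenvalues in ascending order
null_space_rank = 1            # drop the smallest eigenvalue/eigenvector

# (J^T J)^+ = sum over the retained i of (1 / lambda_i) e_i e_i^T
keep = np.arange(lam.size) >= null_space_rank
pinv = (E[:, keep] / lam[keep]) @ E[:, keep].T

# null_space_rank = -1 would instead drop every term with
# lambda_i / lambda_max < min_reciprocal_condition_number.
print(pinv.shape)  # (2, 2)
```

Keeping only the well-conditioned eigenspace yields a symmetric, finite
pseudo inverse instead of the huge, meaningless entries a plain inverse
would produce.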
.. member:: bool Covariance::Options::apply_loss_function