Update documentation for Covariance

Change-Id: Ia4a7347ef8267b7107698d85fcbfc986111958dc
diff --git a/docs/source/solving.rst b/docs/source/solving.rst
index 86a931a..662887e 100644
--- a/docs/source/solving.rst
+++ b/docs/source/solving.rst
@@ -395,7 +395,7 @@
    which the last `M` iterations are used to approximate the inverse Hessian
    used to compute a quasi-Newton step [Nocedal]_, [ByrdNocedal]_.
 
-Currently Ceres Solver supports both a backtracking and interpolation 
+Currently Ceres Solver supports both a backtracking and interpolation
 based Armijo line search algorithm, and a sectioning / zoom interpolation
 (strong) Wolfe condition line search algorithm.  However, note that in order for
 the assumptions underlying the ``BFGS`` and ``LBFGS`` methods to be
@@ -1968,23 +1968,68 @@
    Number of threads to be used for evaluating the Jacobian and
    estimation of covariance.
 
-.. member:: bool Covariance::Options::use_dense_linear_algebra
+.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type
 
-   Default: ``false``
+   Default: ``SPARSE_QR`` or ``DENSE_SVD``, depending on whether
+   ``SuiteSparse`` is available.
 
-   When ``true``, ``Eigen``'s ``JacobiSVD`` algorithm is used to
-   perform the computations. It is an accurate but slow method and
-   should only be used for small to moderate sized problems.
+   Ceres supports three different algorithms for covariance
+   estimation, which represent different tradeoffs in speed, accuracy
+   and reliability.
 
-   When ``false``, ``SuiteSparse/CHOLMOD`` is used to perform the
-   computation. Recent versions of ``SuiteSparse`` (>= 4.2.0) provide
-   a much more efficient method for solving for rows of the covariance
-   matrix. Therefore, if you are doing large scale covariance
-   estimation, we strongly recommend using a recent version of
-   ``SuiteSparse``.
+   1. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
+      computations. It computes the singular value decomposition
 
-   This setting also has an effect on how the following two options
-   are interpreted.
+      .. math::   U S V^\top = J
+
+      and then uses it to compute the pseudo inverse of :math:`J'J` as
+
+      .. math::   (J'J)^{\dagger} = V (S^\top S)^{\dagger} V^\top
+
+      It is an accurate but slow method and should only be used for
+      small to moderate sized problems. It can handle full-rank as
+      well as rank deficient Jacobians.
+
+   2. ``SPARSE_CHOLESKY`` uses the ``CHOLMOD`` sparse Cholesky
+      factorization library to compute the decomposition
+
+      .. math::   R^\top R = J^\top J
+
+      and then
+
+      .. math::   \left(J^\top J\right)^{-1} = \left(R^\top R\right)^{-1} = R^{-1} R^{-\top}
+
+      It is a fast algorithm for sparse matrices that should be used
+      when the Jacobian matrix :math:`J` is well conditioned. For
+      ill-conditioned matrices, this algorithm can fail
+      unpredictably. This is because Cholesky factorization is not a
+      rank-revealing factorization, i.e., it cannot reliably detect
+      when the matrix being factorized is not of full
+      rank. ``SuiteSparse``/``CHOLMOD`` supplies a heuristic for
+      checking if the matrix is rank deficient (``cholmod_rcond``),
+      but it is only a heuristic and can produce both false positives
+      and false negatives.
+
+      Recent versions of ``SuiteSparse`` (>= 4.2.0) provide a much more
+      efficient method for solving for rows of the covariance
+      matrix. Therefore, if you are using ``SPARSE_CHOLESKY``, we strongly
+      recommend using a recent version of ``SuiteSparse``.
+
+   3. ``SPARSE_QR`` uses the ``SuiteSparseQR`` sparse QR factorization
+      library to compute the decomposition
+
+      .. math::
+
+          QR &= J\\
+          \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}
+
+      It is a moderately fast algorithm for sparse matrices. At the
+      price of more time and memory than ``SPARSE_CHOLESKY``, it is
+      numerically better behaved and rank revealing, i.e., it can
+      reliably detect when the Jacobian matrix is rank deficient.
+
+   Neither ``SPARSE_CHOLESKY`` nor ``SPARSE_QR`` is capable of
+   computing the covariance if the Jacobian is rank deficient.
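+
+   For illustration, here is a minimal sketch of selecting an
+   algorithm. It assumes the usual ``using`` declarations and that
+   parameter blocks ``x`` (here, of size 3) and ``y`` and a
+   ``problem`` already exist; none of these names are prescribed by
+   the API:
+
+   .. code-block:: c++
+
+      Covariance::Options options;
+      options.algorithm_type = ceres::DENSE_SVD;  // accurate but slow
+      Covariance covariance(options);
+
+      std::vector<std::pair<const double*, const double*> > covariance_blocks;
+      covariance_blocks.push_back(std::make_pair(x, x));
+      covariance_blocks.push_back(std::make_pair(x, y));
+
+      // Compute() returns false if rank deficiency is detected.
+      if (covariance.Compute(covariance_blocks, &problem)) {
+        double covariance_xx[3 * 3];
+        covariance.GetCovarianceBlock(x, x, covariance_xx);
+      }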
 
-.. member:: int Covariance::Options::min_reciprocal_condition_number
+.. member:: double Covariance::Options::min_reciprocal_condition_number
 
@@ -2009,89 +2054,79 @@
                   -2.0471e+14   2.0471e+14
                   \end{bmatrix}
 
-   This is not a useful result.
 
-   The reciprocal condition number of a matrix is a measure of
-   ill-conditioning or how close the matrix is to being singular/rank
-   deficient. It is defined as the ratio of the smallest eigenvalue of
-   the matrix to the largest eigenvalue. In the above case the
-   reciprocal condition number is about :math:`10^{-16}`. Which is
-   close to machine precision and even though the inverse exists, it
-   is meaningless, and care should be taken to interpet the results of
-   such an inversion.
+   This is not a useful result. Therefore, by default
+   :func:`Covariance::Compute` will return ``false`` if a rank
+   deficient Jacobian is encountered. How rank deficiency is detected
+   depends on the algorithm being used; a usage sketch follows the
+   list below.
 
-   Matrices with condition number lower than
-   ``min_reciprocal_condition_number`` are considered rank deficient
-   and by default Covariance::Compute will return false if it
-   encounters such a matrix.
+   1. ``DENSE_SVD``
 
-   a. ``use_dense_linear_algebra = false``
+      .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}}  < \sqrt{\text{min_reciprocal_condition_number}}
 
-      When performing large scale sparse covariance estimation,
-      computing the exact value of the reciprocal condition number is
-      not possible as it would require computing the eigenvalues of
-      :math:`J'J`.
+      where :math:`\sigma_{\text{min}}` and
+      :math:`\sigma_{\text{max}}` are the minimum and maximum
+      singular values of :math:`J` respectively.
 
-      In this case we use cholmod_rcond, which uses the ratio of the
-      smallest to the largest diagonal entries of the Cholesky
-      factorization as an approximation to the reciprocal condition
-      number.
+   2. ``SPARSE_CHOLESKY``
+
+      .. math::  \text{cholmod_rcond} < \text{min_reciprocal_condition_number}
 
-      However, care must be taken as this is a heuristic and can
-      sometimes be a very crude estimate. The default value of
-      ``min_reciprocal_condition_number`` has been set to a conservative
-      value, and sometimes the ``Covariance::Compute`` may return false
-      even if it is possible to estimate the covariance reliably. In
-      such cases, the user should exercise their judgement before
-      lowering the value of ``min_reciprocal_condition_number``.
+      Here ``cholmod_rcond`` is a crude estimate of the reciprocal
+      condition number of :math:`J^\top J` obtained by using the
+      maximum and minimum diagonal entries of the Cholesky factor
+      :math:`R`. There are no theoretical guarantees associated with
+      this test. It can give false positives and negatives. Use at
+      your own risk. The default value of
+      ``min_reciprocal_condition_number`` has been set to a
+      conservative value, and sometimes :func:`Covariance::Compute`
+      may return ``false`` even if it is possible to estimate the
+      covariance reliably. In such cases, the user should exercise
+      their judgement before lowering the value of
+      ``min_reciprocal_condition_number``.
 
-   b. ``use_dense_linear_algebra = true``
+   3. ``SPARSE_QR``
 
-      When using dense linear algebra, the user has more control in
-      dealing with singular and near singular covariance matrices.
+      .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)
 
-      As mentioned above, when the covariance matrix is near singular,
-      instead of computing the inverse of :math:`J'J`, the
-      Moore-Penrose pseudoinverse of :math:`J'J` should be computed.
-
-      If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
-      e_i)`, where :math:`lambda_i` is the :math:`i^\textrm{th}`
-      eigenvalue and :math:`e_i` is the corresponding eigenvector,
-      then the inverse of :math:`J'J` is
-
-      .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'
-
-      and computing the pseudo inverse involves dropping terms from
-      this sum that correspond to small eigenvalues.
-
-      How terms are dropped is controlled by
-      `min_reciprocal_condition_number` and `null_space_rank`.
-
-      If `null_space_rank` is non-negative, then the smallest
-      `null_space_rank` eigenvalue/eigenvectors are dropped
-      irrespective of the magnitude of :math:`\lambda_i`. If the ratio
-      of the smallest non-zero eigenvalue to the largest eigenvalue in
-      the truncated matrix is still below
-      min_reciprocal_condition_number, then the
-      `Covariance::Compute()` will fail and return `false`.
-
-      Setting `null_space_rank = -1` drops all terms for which
-
-      .. math::  \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}
+      Here :math:`\operatorname{rank}(J)` is the estimate of the
+      rank of :math:`J` returned by the ``SuiteSparseQR`` algorithm.
+      It is a fairly reliable indication of rank deficiency.
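+
+   As an illustration, here is a hedged sketch of adjusting the
+   threshold after a failed rank test; the value chosen and the
+   ``covariance_blocks`` and ``problem`` variables are assumptions of
+   this example, not recommendations:
+
+   .. code-block:: c++
+
+      Covariance::Options options;
+      // The default threshold is conservative; lower it only after
+      // convincing yourself that the Jacobian is well conditioned.
+      options.min_reciprocal_condition_number = 1e-14;  // illustrative
+      Covariance covariance(options);
+      if (!covariance.Compute(covariance_blocks, &problem)) {
+        // The Jacobian was judged rank deficient by the test above.
+      }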
 
 .. member:: int Covariance::Options::null_space_rank
 
-   Truncate the smallest ``null_space_rank`` eigenvectors when
-   computing the pseudo inverse of :math:`J'J`.
+   When using ``DENSE_SVD``, the user has more control in dealing
+   with singular and near singular covariance matrices.
 
-   If ``null_space_rank = -1``, then all eigenvectors with eigenvalues
-   s.t.
+   As mentioned above, when the covariance matrix is near singular,
+   instead of computing the inverse of :math:`J'J`, the Moore-Penrose
+   pseudoinverse of :math:`J'J` should be computed.
 
-   :math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}
+   If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
+   e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
+   eigenvalue and :math:`e_i` is the corresponding eigenvector, then
+   the inverse of :math:`J'J` is
 
-   are dropped. See the documentation for
-   ``min_reciprocal_condition_number`` for more details.
+   .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'
+
+   and computing the pseudo inverse involves dropping terms from this
+   sum that correspond to small eigenvalues.
+
+   How terms are dropped is controlled by
+   ``min_reciprocal_condition_number`` and ``null_space_rank``.
+
+   If ``null_space_rank`` is non-negative, then the smallest
+   ``null_space_rank`` eigenvalue/eigenvectors are dropped
+   irrespective of the magnitude of :math:`\lambda_i`. If the ratio
+   of the smallest non-zero eigenvalue to the largest eigenvalue in
+   the truncated matrix is still below
+   ``min_reciprocal_condition_number``, then ``Covariance::Compute()``
+   will fail and return ``false``.
+
+   Setting ``null_space_rank = -1`` drops all terms for which
+
+   .. math::  \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}
+
+   This option has no effect on the ``SPARSE_QR`` and
+   ``SPARSE_CHOLESKY`` algorithms.
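+
+   A minimal, hedged sketch of the pseudo inverse path (``DENSE_SVD``
+   only; the ``covariance_blocks`` and ``problem`` variables are
+   assumed to have been set up as usual):
+
+   .. code-block:: c++
+
+      Covariance::Options options;
+      options.algorithm_type = ceres::DENSE_SVD;
+      // Drop every term failing the reciprocal condition number test,
+      // i.e., compute the Moore-Penrose pseudo inverse of J'J.
+      options.null_space_rank = -1;
+      Covariance covariance(options);
+      CHECK(covariance.Compute(covariance_blocks, &problem));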
 
 .. member:: bool Covariance::Options::apply_loss_function