More tips and tricks.
1. Add a tip about glog.
2. Add a tip about using Summary::FullReport to optimize performance.
3. Add a tip about using the Inverse Function Theorem.
Change-Id: I949ec6843ff796672edbad8bc801230dd0ab5345
diff --git a/docs/source/tricks.rst b/docs/source/tricks.rst
index 2622e23..32d0499 100644
--- a/docs/source/tricks.rst
+++ b/docs/source/tricks.rst
@@ -37,10 +37,182 @@
:class:`NumericDiffFunctor` and :class:`CostFunctionToFunctor`.
-2. Diagnosing convergence issues.
+2. Use `google-glog <http://code.google.com/p/google-glog>`_.
- TBD
+ Ceres has extensive support for logging various stages of the
+ solve. This includes detailed information about memory allocations
+ and time consumed in various parts of the solve, internal error
+ conditions etc. This logging structure is built on top of the
+ `google-glog <http://code.google.com/p/google-glog>`_ library and
+ can easily be controlled from the command line.
-3. Diagnoising performance issues.
+ We use it extensively to observe and analyze Ceres's
+ performance. Starting with ``-logtostderr`` you can add ``-v=N``
+ for increasing values of ``N`` to get more and more verbose and
+ detailed information about Ceres internals.
- TBD
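+
+ For example, here is a minimal sketch (assuming the program was built
+ with gflags and google-glog, as the Ceres examples are) of wiring up
+ logging so that ``-logtostderr`` and ``-v=N`` work from the command
+ line:
+
+ .. code-block:: c++
+
+    #include "ceres/ceres.h"
+    #include "gflags/gflags.h"
+    #include "glog/logging.h"
+
+    int main(int argc, char** argv) {
+      // Pick up -logtostderr, -v=N, etc. from the command line and
+      // initialize google-glog. (Newer gflags versions export the
+      // gflags:: namespace instead of google::.)
+      google::ParseCommandLineFlags(&argc, &argv, true);
+      google::InitGoogleLogging(argv[0]);
+      // ... construct a ceres::Problem and call ceres::Solve() here ...
+      return 0;
+    }
+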
+ Building Ceres like this introduces an external dependency, and it
+ is tempting to use the ``miniglog`` implementation that ships
+ inside Ceres instead. This is a bad idea.
+
+ ``miniglog`` was written primarily for building and using Ceres on
+ Android because the current version of `google-glog
+ <http://code.google.com/p/google-glog>`_ does not build using the
+ NDK. It has worse performance than the full-fledged glog library
+ and is much harder to control and use.
+
+3. ``Solver::Summary::FullReport`` is your friend.
+
+ When diagnosing Ceres performance issues (runtime and convergence),
+ the first place to start is by looking at the output of
+ ``Solver::Summary::FullReport``. Here is an example:
+
+ .. code-block:: bash
+
+ ./bin/bundle_adjuster --input ../data/problem-16-22106-pre.txt
+
+
+ 0: f: 4.185660e+06 d: 0.00e+00 g: 2.16e+07 h: 0.00e+00 rho: 0.00e+00 mu: 1.00e+04 li: 0 it: 9.20e-02 tt: 3.35e-01
+ 1: f: 1.980525e+05 d: 3.99e+06 g: 5.34e+06 h: 2.40e+03 rho: 9.60e-01 mu: 3.00e+04 li: 1 it: 1.99e-01 tt: 5.34e-01
+ 2: f: 5.086543e+04 d: 1.47e+05 g: 2.11e+06 h: 1.01e+03 rho: 8.22e-01 mu: 4.09e+04 li: 1 it: 1.61e-01 tt: 6.95e-01
+ 3: f: 1.859667e+04 d: 3.23e+04 g: 2.87e+05 h: 2.64e+02 rho: 9.85e-01 mu: 1.23e+05 li: 1 it: 1.63e-01 tt: 8.58e-01
+ 4: f: 1.803857e+04 d: 5.58e+02 g: 2.69e+04 h: 8.66e+01 rho: 9.93e-01 mu: 3.69e+05 li: 1 it: 1.62e-01 tt: 1.02e+00
+ 5: f: 1.803391e+04 d: 4.66e+00 g: 3.11e+02 h: 1.02e+01 rho: 1.00e+00 mu: 1.11e+06 li: 1 it: 1.61e-01 tt: 1.18e+00
+
+ Ceres Solver Report
+ -------------------
+ Original Reduced
+ Parameter blocks 22122 22122
+ Parameters 66462 66462
+ Residual blocks 83718 83718
+ Residual 167436 167436
+
+ Minimizer TRUST_REGION
+
+ Sparse linear algebra library SUITE_SPARSE
+ Trust region strategy LEVENBERG_MARQUARDT
+
+ Given Used
+ Linear solver SPARSE_SCHUR SPARSE_SCHUR
+ Threads 1 1
+ Linear solver threads 1 1
+ Linear solver ordering AUTOMATIC 22106, 16
+
+ Cost:
+ Initial 4.185660e+06
+ Final 1.803391e+04
+ Change 4.167626e+06
+
+ Minimizer iterations 5
+ Successful steps 5
+ Unsuccessful steps 0
+
+ Time (in seconds):
+ Preprocessor 0.243
+
+ Residual evaluation 0.053
+ Jacobian evaluation 0.435
+ Linear solver 0.371
+ Minimizer 0.940
+
+ Postprocessor 0.002
+ Total 1.221
+
+ Termination: NO_CONVERGENCE (Maximum number of iterations reached.)
+
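+ In your own code the report above comes from the ``Solver::Summary``
+ object filled in by the solve. A minimal sketch, assuming ``problem``
+ is an already constructed ``ceres::Problem``:
+
+ .. code-block:: c++
+
+    ceres::Solver::Options options;
+    ceres::Solver::Summary summary;
+    ceres::Solve(options, &problem, &summary);
+    std::cout << summary.FullReport() << "\n";
+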
+ Let us focus on run-time performance. The relevant lines to look at
+ are:
+
+ .. code-block:: bash
+
+ Time (in seconds):
+ Preprocessor 0.243
+
+ Residual evaluation 0.053
+ Jacobian evaluation 0.435
+ Linear solver 0.371
+ Minimizer 0.940
+
+ Postprocessor 0.002
+ Total 1.221
+
+ This tells us that of the total 1.2 seconds, about 0.4 seconds was
+ spent in the linear solver and the rest was mostly spent in
+ preprocessing and Jacobian evaluation.
+
+ The preprocessing seems particularly expensive. Looking back at the
+ report, we observe:
+
+ .. code-block:: bash
+
+ Linear solver ordering AUTOMATIC 22106, 16
+
+ This indicates that we are using the automatic ordering for the
+ ``SPARSE_SCHUR`` solver, which can be expensive at times. A
+ straightforward way to deal with this is to supply the ordering
+ manually. For ``bundle_adjuster`` this can be done by passing the
+ flag ``-ordering=user``. Doing so and looking at the timing block of
+ the full report gives us:
+
+ .. code-block:: bash
+
+ Time (in seconds):
+ Preprocessor 0.058
+
+ Residual evaluation 0.050
+ Jacobian evaluation 0.416
+ Linear solver 0.360
+ Minimizer 0.903
+
+ Postprocessor 0.002
+ Total 0.998
+
+ The preprocessor time has gone down by more than 4x!
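+
+ In your own code, the equivalent of ``-ordering=user`` is to build a
+ ``ceres::ParameterBlockOrdering`` and hand it to the solver. A
+ minimal sketch, assuming ``points`` and ``cameras`` are vectors
+ holding the parameter blocks of a bundle adjustment problem, and a
+ Ceres version where ``linear_solver_ordering`` is a shared pointer:
+
+ .. code-block:: c++
+
+    ceres::Solver::Options options;
+    options.linear_solver_type = ceres::SPARSE_SCHUR;
+
+    // Group 0 is eliminated first and holds the points; group 1 holds
+    // the cameras.
+    ceres::ParameterBlockOrdering* ordering =
+        new ceres::ParameterBlockOrdering;
+    for (size_t i = 0; i < points.size(); ++i) {
+      ordering->AddElementToGroup(points[i], 0);
+    }
+    for (size_t i = 0; i < cameras.size(); ++i) {
+      ordering->AddElementToGroup(cameras[i], 1);
+    }
+    options.linear_solver_ordering.reset(ordering);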
+
+
+4. Putting the `Inverse Function Theorem
+ <http://en.wikipedia.org/wiki/Inverse_function_theorem>`_ to use.
+
+ Every now and then we have to deal with functions which cannot be
+ evaluated analytically. Computing the Jacobian in such cases is
+ tricky. A particularly interesting case is where the inverse of the
+ function is easy to compute analytically. An example of such a
+ function is the coordinate transformation between the `ECEF
+ <http://en.wikipedia.org/wiki/ECEF>`_ and the `WGS84
+ <http://en.wikipedia.org/wiki/World_Geodetic_System>`_ frames, where
+ the conversion from WGS84 to ECEF is analytic, but the conversion
+ back to WGS84 uses an iterative algorithm. So how do you compute the
+ derivative of the ECEF to WGS84 transformation?
+
+ One obvious approach would be to numerically differentiate the
+ conversion function. This is not a good idea: not only will it be
+ slow, it will also be numerically inaccurate.
+
+ It turns out that you can use the `Inverse Function Theorem
+ <http://en.wikipedia.org/wiki/Inverse_function_theorem>`_ in this
+ case to compute the derivatives more or less analytically.
+
+ The key result here is the following: if :math:`x = f^{-1}(y)` and
+ :math:`Df(x)` is the invertible Jacobian of :math:`f` at :math:`x`,
+ then the Jacobian :math:`Df^{-1}(y) = [Df(x)]^{-1}`, i.e., the
+ Jacobian of :math:`f^{-1}` is the inverse of the Jacobian of
+ :math:`f`.
+
+ Algorithmically this means that given :math:`y`, we compute
+ :math:`x = f^{-1}(y)` by whatever means we can, and evaluate the
+ Jacobian of :math:`f` at :math:`x`. If this Jacobian matrix is
+ invertible, then its inverse is the Jacobian of :math:`f^{-1}` at
+ :math:`y`.
+
+ One can put this into practice with the following code fragment.
+
+ .. code-block:: c++
+
+ Eigen::Vector3d ecef; // Fill some values
+ // Iterative computation.
+ Eigen::Vector3d lla = ECEFToLLA(ecef);
+ // Analytic derivatives
+ Eigen::Matrix3d lla_to_ecef_jacobian = LLAToECEFJacobian(lla);
+ bool invertible;
+ Eigen::Matrix3d ecef_to_lla_jacobian;
+ lla_to_ecef_jacobian.computeInverseWithCheck(ecef_to_lla_jacobian, invertible);
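+
+ One way to feed such a Jacobian to Ceres is through an analytic cost
+ function. The following is only a sketch: ``LLAResidual`` is a
+ hypothetical cost function that penalizes the difference between a
+ measured WGS84 (LLA) position and the LLA position implied by the
+ current ECEF estimate, reusing the ``ECEFToLLA`` and
+ ``LLAToECEFJacobian`` helpers from above.
+
+ .. code-block:: c++
+
+    class LLAResidual : public ceres::SizedCostFunction<3, 3> {
+     public:
+      explicit LLAResidual(const Eigen::Vector3d& lla_measured)
+          : lla_measured_(lla_measured) {}
+
+      virtual bool Evaluate(double const* const* parameters,
+                            double* residuals,
+                            double** jacobians) const {
+        const Eigen::Map<const Eigen::Vector3d> ecef(parameters[0]);
+        const Eigen::Vector3d lla = ECEFToLLA(ecef);  // Iterative.
+        Eigen::Map<Eigen::Vector3d>(residuals) = lla - lla_measured_;
+
+        if (jacobians != NULL && jacobians[0] != NULL) {
+          // Inverse Function Theorem: the Jacobian of ECEF -> LLA at
+          // ecef is the inverse of the Jacobian of LLA -> ECEF at lla.
+          bool invertible = false;
+          Eigen::Matrix3d ecef_to_lla_jacobian;
+          LLAToECEFJacobian(lla).computeInverseWithCheck(
+              ecef_to_lla_jacobian, invertible);
+          if (!invertible) {
+            return false;
+          }
+          // Ceres expects row-major Jacobian blocks.
+          Eigen::Map<Eigen::Matrix<double, 3, 3, Eigen::RowMajor> >(
+              jacobians[0]) = ecef_to_lla_jacobian;
+        }
+        return true;
+      }
+
+     private:
+      const Eigen::Vector3d lla_measured_;
+    };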