More tips and tricks.

1. Add a tip about glog.
2. Add a tip about using Summary::FullReport to optimize performance.
3. Add a tip about using the Inverse Function Theorem.

Change-Id: I949ec6843ff796672edbad8bc801230dd0ab5345
diff --git a/docs/source/tricks.rst b/docs/source/tricks.rst
index 2622e23..32d0499 100644
--- a/docs/source/tricks.rst
+++ b/docs/source/tricks.rst
@@ -37,10 +37,182 @@
    :class:`NumericDiffFunctor` and :class:`CostFunctionToFunctor`.
 
 
-2. Diagnosing convergence issues.
+2. Use `google-glog <http://code.google.com/p/google-glog>`_.
 
-   TBD
+   Ceres has extensive support for logging various stages of the
+   solve. This includes detailed information about memory allocations
+   and time consumed in various parts of the solve, internal error
+   conditions etc. This logging structure is built on top of the
+   `google-glog <http://code.google.com/p/google-glog>`_ library and
+   can easily be controlled from the command line.
 
-3. Diagnoising performance issues.
+   We use it extensively to observe and analyze Ceres's
+   performance. Starting with ``-logtostderr`` you can add ``-v=N``
+   for increasing values of N to get more and more verbose and
+   detailed information about Ceres internals.
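For example, one might run the ``bundle_adjuster`` example with logging sent to ``stderr`` and verbosity level 2; the flag values here are illustrative, and the binary and data paths follow the example used later in this section:

```shell
# Log to stderr rather than to files, with VLOG level 2 for extra
# detail about Ceres internals.
./bin/bundle_adjuster -logtostderr -v=2 \
  --input ../data/problem-16-22106-pre.txt
```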
 
-   TBD
+   Building Ceres like this introduces an external dependency, so it
+   is tempting to use the ``miniglog`` implementation that ships
+   inside Ceres instead. This is a bad idea.
+
+   ``miniglog`` was written primarily for building and using Ceres on
+   Android because the current version of `google-glog
+   <http://code.google.com/p/google-glog>`_ does not build using the
+   NDK. It has worse performance than the full-fledged glog library
+   and is much harder to control and use.
+
+3. ``Solver::Summary::FullReport`` is your friend.
+
+   When diagnosing Ceres performance issues, whether runtime or
+   convergence, the first place to start is the output of
+   ``Solver::Summary::FullReport``. Here is an example:
+
+   .. code-block:: bash
+
+     ./bin/bundle_adjuster --input ../data/problem-16-22106-pre.txt
+
+
+     0: f: 4.185660e+06 d: 0.00e+00 g: 2.16e+07 h: 0.00e+00 rho: 0.00e+00 mu: 1.00e+04 li:  0 it: 9.20e-02 tt: 3.35e-01
+     1: f: 1.980525e+05 d: 3.99e+06 g: 5.34e+06 h: 2.40e+03 rho: 9.60e-01 mu: 3.00e+04 li:  1 it: 1.99e-01 tt: 5.34e-01
+     2: f: 5.086543e+04 d: 1.47e+05 g: 2.11e+06 h: 1.01e+03 rho: 8.22e-01 mu: 4.09e+04 li:  1 it: 1.61e-01 tt: 6.95e-01
+     3: f: 1.859667e+04 d: 3.23e+04 g: 2.87e+05 h: 2.64e+02 rho: 9.85e-01 mu: 1.23e+05 li:  1 it: 1.63e-01 tt: 8.58e-01
+     4: f: 1.803857e+04 d: 5.58e+02 g: 2.69e+04 h: 8.66e+01 rho: 9.93e-01 mu: 3.69e+05 li:  1 it: 1.62e-01 tt: 1.02e+00
+     5: f: 1.803391e+04 d: 4.66e+00 g: 3.11e+02 h: 1.02e+01 rho: 1.00e+00 mu: 1.11e+06 li:  1 it: 1.61e-01 tt: 1.18e+00
+
+     Ceres Solver Report
+     -------------------
+                                          Original                  Reduced
+     Parameter blocks                        22122                    22122
+     Parameters                              66462                    66462
+     Residual blocks                         83718                    83718
+     Residual                               167436                   167436
+
+     Minimizer                        TRUST_REGION
+
+     Sparse linear algebra library    SUITE_SPARSE
+     Trust region strategy     LEVENBERG_MARQUARDT
+
+                                              Given                     Used
+     Linear solver                    SPARSE_SCHUR             SPARSE_SCHUR
+     Threads                                     1                        1
+     Linear solver threads                       1                        1
+     Linear solver ordering              AUTOMATIC                22106, 16
+
+     Cost:
+     Initial                          4.185660e+06
+     Final                            1.803391e+04
+     Change                           4.167626e+06
+
+     Minimizer iterations                        5
+     Successful steps                            5
+     Unsuccessful steps                          0
+
+     Time (in seconds):
+     Preprocessor                            0.243
+
+       Residual evaluation                   0.053
+       Jacobian evaluation                   0.435
+       Linear solver                         0.371
+     Minimizer                               0.940
+
+     Postprocessor                           0.002
+     Total                                   1.221
+
+     Termination:                   NO_CONVERGENCE (Maximum number of iterations reached.)
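In code, the report is obtained from the ``Solver::Summary`` after the solve. A minimal sketch, assuming ``options`` and ``problem`` have already been set up as usual:

```cpp
#include <iostream>
#include "ceres/ceres.h"

// After building `options` and `problem`:
ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);
// FullReport() returns the diagnostic text shown above as a string.
std::cout << summary.FullReport() << "\n";
```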
+
+   Let us focus on runtime performance. The relevant lines to look at
+   are:
+
+   .. code-block:: bash
+
+     Time (in seconds):
+     Preprocessor                            0.243
+
+       Residual evaluation                   0.053
+       Jacobian evaluation                   0.435
+       Linear solver                         0.371
+     Minimizer                               0.940
+
+     Postprocessor                           0.002
+     Total                                   1.221
+
+   This tells us that of the total 1.2 seconds, about 0.4 seconds was
+   spent in the linear solver, and most of the rest went to
+   preprocessing and Jacobian evaluation.
+
+   The preprocessing seems particularly expensive. Looking back at the
+   report, we observe:
+
+   .. code-block:: bash
+
+     Linear solver ordering              AUTOMATIC                22106, 16
+
+   This indicates that we are using automatic ordering for the
+   ``SPARSE_SCHUR`` solver, which can be expensive at times. A
+   straightforward way to deal with this is to supply the ordering
+   manually. For ``bundle_adjuster`` this can be done by passing the
+   flag ``-ordering=user``. Doing so and looking at the timing block
+   of the full report gives us:
+
+   .. code-block:: bash
+
+     Time (in seconds):
+     Preprocessor                            0.058
+
+       Residual evaluation                   0.050
+       Jacobian evaluation                   0.416
+       Linear solver                         0.360
+     Minimizer                               0.903
+
+     Postprocessor                           0.002
+     Total                                   0.998
+
+   The preprocessor time has gone down by more than 4x!
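When using the C++ API rather than the ``bundle_adjuster`` flag, the ordering is supplied via ``Solver::Options``. A minimal sketch, assuming ``points`` and ``cameras`` are vectors holding the parameter block pointers (the names are illustrative, and the ownership semantics of ``linear_solver_ordering`` depend on the Ceres version):

```cpp
#include <vector>
#include "ceres/ceres.h"

// Place all point blocks in elimination group 0 and all camera blocks
// in group 1, so that SPARSE_SCHUR eliminates the points first.
void SetManualOrdering(const std::vector<double*>& points,
                       const std::vector<double*>& cameras,
                       ceres::Solver::Options* options) {
  ceres::ParameterBlockOrdering* ordering =
      new ceres::ParameterBlockOrdering;
  for (double* point : points) {
    ordering->AddElementToGroup(point, 0);
  }
  for (double* camera : cameras) {
    ordering->AddElementToGroup(camera, 1);
  }
  options->linear_solver_ordering = ordering;
}
```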
+
+
+4. Putting the `Inverse Function Theorem
+   <http://en.wikipedia.org/wiki/Inverse_function_theorem>`_ to use.
+
+   Every now and then we have to deal with functions which cannot be
+   evaluated analytically. Computing the Jacobian in such cases is
+   tricky. A particularly interesting case is where the inverse of the
+   function is easy to compute analytically. An example of such a
+   function is the Coordinate transformation between the `ECEF
+   <http://en.wikipedia.org/wiki/ECEF>`_ and the `WGS84
+   <http://en.wikipedia.org/wiki/World_Geodetic_System>`_ where the
+   conversion from WGS84 to ECEF is analytic, but the conversion back
+   to ECEF uses an iterative algorithm. So how do you compute the
+   derivative of the ECEF to WGS84 transformation?
+
+   One obvious approach would be to numerically differentiate the
+   conversion function. This is not a good idea: not only will it be
+   slow, it will also be numerically inaccurate.
+
+   It turns out that you can use the `Inverse Function Theorem
+   <http://en.wikipedia.org/wiki/Inverse_function_theorem>`_ in this
+   case to compute the derivative more or less analytically.
+
+   The key result here is: if :math:`x = f^{-1}(y)` and :math:`Df(x)`
+   is the invertible Jacobian of :math:`f` at :math:`x`, then the
+   Jacobian :math:`Df^{-1}(y) = [Df(x)]^{-1}`, i.e., the Jacobian of
+   :math:`f^{-1}` is the inverse of the Jacobian of :math:`f`.
+
+   Algorithmically this means that given :math:`y`, compute :math:`x =
+   f^{-1}(y)` by whatever means you can, then evaluate the Jacobian of
+   :math:`f` at :math:`x`. If this Jacobian matrix is invertible, its
+   inverse is the Jacobian of :math:`f^{-1}` at :math:`y`.
+
+   One can put this into practice with the following code fragment,
+   where ``ECEFToLLA`` and ``LLAToECEFJacobian`` are assumed to be
+   implemented by the user.
+
+   .. code-block:: c++
+
+      #include <Eigen/Dense>
+
+      Eigen::Vector3d ecef; // Fill with some values.
+      // Iteratively compute the inverse (ECEF to LLA) transformation.
+      Eigen::Vector3d lla = ECEFToLLA(ecef);
+      // Analytic Jacobian of the forward (LLA to ECEF) transformation.
+      Eigen::Matrix3d lla_to_ecef_jacobian = LLAToECEFJacobian(lla);
+      // Invert it to obtain the Jacobian of the ECEF to LLA
+      // transformation, checking that the matrix is invertible.
+      bool invertible;
+      Eigen::Matrix3d ecef_to_lla_jacobian;
+      lla_to_ecef_jacobian.computeInverseWithCheck(
+          ecef_to_lla_jacobian, invertible);