Documentation update.

Change-Id: I0a3c5ae4bc981a8f5bdd5a8905f923dc5f09a024
diff --git a/docs/modeling.tex b/docs/modeling.tex
index 5cd199c..82080b7 100644
--- a/docs/modeling.tex
+++ b/docs/modeling.tex
@@ -2,8 +2,8 @@
 \chapter{Modeling}
 \label{chapter:api}
 \section{\texttt{CostFunction}}
-Given parameter blocks $\left[x_{i_1}, \hdots , x_{i_k}\right]$, a \texttt{CostFunction} is responsible for computing 
-a vector of residuals and if asked a vector of Jacobian matrices, i.e., given $\left[x_{i_1}, \hdots , x_{i_k}\right]$, compute the vector $f_i\left(x_{i_1},\hdots,x_{k_i}\right)$ and the matrices 
+Given parameter blocks $\left[x_{i_1}, \hdots , x_{i_k}\right]$, a \texttt{CostFunction} is responsible for computing
+a vector of residuals and, if asked, a vector of Jacobian matrices, i.e., given $\left[x_{i_1}, \hdots , x_{i_k}\right]$, compute the vector $f_i\left(x_{i_1},\hdots,x_{i_k}\right)$ and the matrices
 \begin{equation}
 J_{ij} = \frac{\partial}{\partial x_{i_j}}f_i\left(x_{i_1},\hdots,x_{i_k}\right),\quad \forall j = 1, \hdots, k
 \end{equation}
@@ -13,7 +13,7 @@
   virtual bool Evaluate(double const* const* parameters,
                         double* residuals,
                         double** jacobians) = 0;
-  const vector<int16>& parameter_block_sizes(); 
+  const vector<int16>& parameter_block_sizes() const;
   int num_residuals() const;
 
  protected:
@@ -26,18 +26,18 @@
 is stored in \texttt{parameter\_block\_sizes\_} and \texttt{num\_residuals\_} respectively. User
 code inheriting from this class is expected to set these two members with the
 corresponding accessors. This information will be verified by the Problem
-when added with \texttt{Problem::AddResidualBlock}. 
+when the residual block is added with \texttt{Problem::AddResidualBlock}.
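+
+For concreteness, here is a minimal sketch of such a subclass (a
+hypothetical \texttt{MyCostFunction} with a single residual
+$f(x,y) = x_1 x_2 + y_1$), assuming the protected mutators
+\texttt{mutable\_parameter\_block\_sizes()} and
+\texttt{set\_num\_residuals()} declared in \texttt{cost\_function.h}. The
+conventions used by \texttt{Evaluate} are described below.
+\begin{minted}{c++}
+class MyCostFunction : public CostFunction {
+ public:
+  MyCostFunction() {
+    mutable_parameter_block_sizes()->push_back(2);  // size of x
+    mutable_parameter_block_sizes()->push_back(1);  // size of y
+    set_num_residuals(1);
+  }
+  virtual bool Evaluate(double const* const* parameters,
+                        double* residuals,
+                        double** jacobians) {
+    const double* x = parameters[0];
+    const double* y = parameters[1];
+    residuals[0] = x[0] * x[1] + y[0];
+    // jacobians and each jacobians[i] may be NULL, in which case the
+    // corresponding derivative is not needed.
+    if (jacobians != NULL) {
+      if (jacobians[0] != NULL) {  // df/dx is 1 x 2, stored row-major.
+        jacobians[0][0] = x[1];
+        jacobians[0][1] = x[0];
+      }
+      if (jacobians[1] != NULL) {  // df/dy is 1 x 1.
+        jacobians[1][0] = 1.0;
+      }
+    }
+    return true;
+  }
+};
+\end{minted}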
 
 The most important method here is \texttt{Evaluate}. It implements the residual and Jacobian computation.
 
-\texttt{parameters}  is an array of pointers to arrays containing the various parameter blocks. parameters has the same number of elements as parameter\_block\_sizes\_.  Parameter blocks are in the same order as parameter\_block\_sizes\_. 
+\texttt{parameters} is an array of pointers to arrays containing the various parameter blocks. \texttt{parameters} has the same number of elements as \texttt{parameter\_block\_sizes\_}. Parameter blocks are in the same order as \texttt{parameter\_block\_sizes\_}.
 
 
-\texttt{residuals} is an array of size \texttt{num\_residuals\_}. 
+\texttt{residuals} is an array of size \texttt{num\_residuals\_}.
 
 
-\texttt{jacobians} is an array of size \texttt{parameter\_block\_sizes\_} containing pointers to storage for Jacobian matrices corresponding to each parameter block. The Jacobian matrices are in the same order as \texttt{parameter\_block\_sizes\_}. \texttt{jacobians[i]} is an array that contains \texttt{num\_residuals\_} $\times$ \texttt{parameter\_block\_sizes\_[i]} elements. Each Jacobian matrix is stored in row-major order, i.e., 
- 
+\texttt{jacobians} is an array of size \texttt{parameter\_block\_sizes\_.size()} containing pointers to storage for Jacobian matrices corresponding to each parameter block. The Jacobian matrices are in the same order as \texttt{parameter\_block\_sizes\_}. \texttt{jacobians[i]} is an array that contains \texttt{num\_residuals\_} $\times$ \texttt{parameter\_block\_sizes\_[i]} elements. Each Jacobian matrix is stored in row-major order, i.e.,
+
 \begin{equation}
 \texttt{jacobians[i][r * parameter\_block\_sizes\_[i] + c]} =
 %\frac{\partial}{\partial x_{ic}}  f_{r}\left(x_{1},\hdots, x_{k}\right)
@@ -98,7 +98,7 @@
   double k_;
 };
 \end{minted}
- 
+
 Note that in the declaration of \texttt{operator()} the input parameters \texttt{x} and \texttt{y} come
  first, and are passed as const pointers to arrays of \texttt{T}. If there were three
  input parameters, then the third input parameter would come after \texttt{y}. The
@@ -136,7 +136,7 @@
  set the template parameters to (dimension of residual, number of parameters)
  instead of passing a dimension parameter for {\em every parameter block}. In the
  example above, that would be \texttt{<MyScalarCostFunction, 1, 2>}, which is missing
- the 2 as the last template argument. 
+ the 2 as the last template argument.
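+
+ For reference, a sketch of the correct construction for the example above,
+ assuming \texttt{MyScalarCostFunction} has a constructor taking $k$:
+\begin{minted}{c++}
+CostFunction* cost_function =
+    new AutoDiffCostFunction<MyScalarCostFunction, 1, 2, 2>(
+        new MyScalarCostFunction(1.0));
+\end{minted}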
 
 \subsection{Theory \& Implementation}
 TBD
@@ -206,7 +206,7 @@
  public:
   virtual void Evaluate(double s, double out[3]) const = 0;
 };
-\end{minted}	
+\end{minted}
 
 The key method is \texttt{Evaluate}, which, given a non-negative scalar \texttt{s}, computes
 \begin{align}
@@ -227,7 +227,7 @@
 \subsection{Scaling}
 Given one robustifier $\rho(s)$
  one can change the length scale at which robustification takes
- place, by adding a scale factor $a > 0$ which gives us $\rho(s,a) = a^2 \rho(s / a^2)$ and the first and second derivatives as $\rho'(s / a^2)$ and   $(1 / a^2) \rho''(s / a^2)$ respectively. 
+ place by adding a scale factor $a > 0$, which gives us $\rho(s,a) = a^2 \rho(s / a^2)$ and the first and second derivatives as $\rho'(s / a^2)$ and $(1 / a^2) \rho''(s / a^2)$ respectively.
 
 
 \begin{figure}[hbt]
@@ -238,7 +238,7 @@
 
 
 The reason for the appearance of squaring is that $a$ is in the units of the residual vector norm whereas $s$ is a squared norm. For applications it is more convenient to specify $a$ than
-its square. 
+its square.
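+
+For example, the built-in \texttt{HuberLoss} takes the scale $a$ directly in
+its constructor; a minimal sketch:
+\begin{minted}{c++}
+// Robustify at residual norm a = 0.1; the constructor takes a, not a^2.
+LossFunction* loss_function = new HuberLoss(0.1);
+\end{minted}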
 
 Here are some common loss functions implemented in Ceres. For simplicity we describe their unscaled versions. Figure~\ref{fig:loss} illustrates their shape graphically.
 
@@ -252,8 +252,36 @@
 		\rho(s) &= \log(1 + s) \tag{\texttt{CauchyLoss}}
 \end{align}
 
+Ceres includes a number of other loss functions, the descriptions and
+documentation for which can be found in \texttt{loss\_function.h}.
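+
+A loss function is attached to a residual block when it is added to the
+problem; a minimal usage sketch, assuming \texttt{problem},
+\texttt{cost\_function} and the parameter block \texttt{x} already exist:
+\begin{minted}{c++}
+problem.AddResidualBlock(cost_function, new CauchyLoss(0.5), x);
+\end{minted}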
+
 \subsection{Theory \& Implementation}
-TBD
+Let us consider a problem with a single residual block and a single parameter block.
+\begin{align}
+\min_x \frac{1}{2}\rho\left(\left\|f(x)\right\|^2\right)
+\end{align}
+
+Then, the robustified gradient and the Gauss-Newton Hessian are
+\begin{align}
+	g(x) &= \rho' J(x)^\top f(x)\\
+	H(x) &= J(x)^\top\left(\rho' I + 2 \rho''f(x)f(x)^\top\right)J(x)
+\end{align}
+where the terms involving the second derivatives of $f(x)$ have been ignored. Note that $H(x)$ is indefinite if $\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0$. If this is not the case, then it is possible to re-weight the residual and the Jacobian matrix such that the robustified Gauss-Newton step corresponds to an ordinary linear least squares problem.
+
+
+Let $\alpha$ be a root of 
+\begin{equation}
+	\frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.
+\end{equation}
+Then, define the rescaled residual and Jacobian as
+\begin{align}
+	\tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
+	\tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)
+\end{align}
+In the case $2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0$, we limit $\alpha \le 1 - \epsilon$ for some small $\epsilon$. For more details see Triggs et al.~\cite{triggs-etal-1999}.
+
+With this simple rescaling, one can use any Jacobian-based non-linear least squares algorithm to solve robustified non-linear least squares problems.
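+
+The following is a sketch of this correction (not the Ceres implementation;
+it uses Eigen, takes $\rho'$ and $\rho''$ evaluated at $s = \|f(x)\|^2$ in
+\texttt{rho[1]} and \texttt{rho[2]}, and assumes $\rho' > 0$):
+\begin{minted}{c++}
+#include <algorithm>
+#include <cmath>
+#include "Eigen/Dense"
+
+void ApplyRobustifier(const double rho[3],
+                      Eigen::VectorXd* f,    // residuals, rescaled in place
+                      Eigen::MatrixXd* J) {  // Jacobian, rescaled in place
+  const double s = f->squaredNorm();
+  const double sqrt_rho1 = std::sqrt(rho[1]);
+  double alpha = 0.0;
+  if (s > 0.0 && rho[2] != 0.0) {
+    // alpha solves 0.5 alpha^2 - alpha - (rho''/rho') s = 0 with alpha < 1;
+    // the max() clamps alpha below 1 when 2 rho'' s + rho' <= 0.
+    const double kEpsilon = 1e-10;  // a hypothetical choice of epsilon
+    const double D = std::max(kEpsilon, 1.0 + 2.0 * s * rho[2] / rho[1]);
+    alpha = 1.0 - std::sqrt(D);
+    // J <- (I - alpha f f^T / ||f||^2) J, using the original residuals.
+    const Eigen::RowVectorXd fTJ = f->transpose() * (*J);
+    *J -= (alpha / s) * (*f) * fTJ;
+  }
+  *J *= sqrt_rho1;                  // completes the rescaled Jacobian
+  *f *= sqrt_rho1 / (1.0 - alpha);  // rescaled residuals
+}
+\end{minted}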
 
 \section{\texttt{LocalParameterization}}
 Sometimes the parameters $x$ can overparameterize a problem. In
@@ -302,7 +330,7 @@
 };
 \end{minted}
 
-\texttt{GlobalSize} is the dimension of the ambient space in which the parameter block $x$ lives. \texttt{LocalSize} is the size of the tangent space that $\Delta x$ lives in. \texttt{Plus} implements $\boxplus(x,\Delta x)$ and $\texttt{ComputeJacobian}$ computes the Jacobian matrix 
+\texttt{GlobalSize} is the dimension of the ambient space in which the parameter block $x$ lives. \texttt{LocalSize} is the size of the tangent space that $\Delta x$ lives in. \texttt{Plus} implements $\boxplus(x,\Delta x)$ and \texttt{ComputeJacobian} computes the Jacobian matrix
 \begin{equation}
 	J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}
 \end{equation}
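+
+As an illustration, here is a minimal sketch (a hypothetical
+\texttt{CircleParameterization}, not part of Ceres) for a point on the unit
+circle stored as $(\cos\theta, \sin\theta)$, i.e., a two dimensional ambient
+space with a one dimensional tangent space:
+\begin{minted}{c++}
+#include <cmath>
+
+class CircleParameterization : public LocalParameterization {
+ public:
+  virtual ~CircleParameterization() {}
+  // Rotate x by the angle delta[0].
+  virtual bool Plus(const double* x,
+                    const double* delta,
+                    double* x_plus_delta) const {
+    const double c = std::cos(delta[0]);
+    const double s = std::sin(delta[0]);
+    x_plus_delta[0] = c * x[0] - s * x[1];
+    x_plus_delta[1] = s * x[0] + c * x[1];
+    return true;
+  }
+  // d Plus(x, delta) / d delta at delta = 0: a 2 x 1 row-major matrix.
+  virtual bool ComputeJacobian(const double* x, double* jacobian) const {
+    jacobian[0] = -x[1];
+    jacobian[1] = x[0];
+    return true;
+  }
+  virtual int GlobalSize() const { return 2; }
+  virtual int LocalSize() const { return 1; }
+};
+\end{minted}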
@@ -334,9 +362,9 @@
 dimensional vector and define $\boxplus$ to be
 
 \begin{equation}
-  \boxplus(x, \Delta x) = 
+  \boxplus(x, \Delta x) =
 \left[
-\cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x 
+\cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x
 \right] * x
 \label{eq:quaternion}
 \end{equation}
@@ -413,13 +441,13 @@
    blocks if they are not present, so calling \texttt{AddParameterBlock}
    explicitly is not required.
 
-  
+
    \texttt{Problem} by default takes ownership of the
  \texttt{cost\_function} and \texttt{loss\_function} pointers. These objects remain
    live for the life of the \texttt{Problem} object. If the user wishes to
   keep control over the destruction of these objects, then they can
   do this by setting the corresponding enums in the \texttt{Options} struct.
-  
+
 
   Note that even though the Problem takes ownership of \texttt{cost\_function}
   and \texttt{loss\_function}, it does not preclude the user from re-using
@@ -430,9 +458,9 @@
 \texttt{AddParameterBlock} explicitly adds a parameter block to the \texttt{Problem}. Optionally it allows the user to associate a \texttt{LocalParameterization} object with the parameter block too. Repeated calls with the same arguments are ignored. Repeated
 calls with the same double pointer but a different size result in undefined behaviour.
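+
+A usage sketch, assuming the built-in \texttt{QuaternionParameterization},
+which implements the $\boxplus$ of Equation~\ref{eq:quaternion} for a
+scalar-first unit quaternion:
+\begin{minted}{c++}
+double q[4] = {1.0, 0.0, 0.0, 0.0};
+problem.AddParameterBlock(q, 4, new QuaternionParameterization);
+\end{minted}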
 
-You can set any parameter block to be constant using 
+You can set any parameter block to be constant using
 
-\texttt{Problem::SetParameterBlockConstant} 
+\texttt{Problem::SetParameterBlockConstant}
 
 and undo this using
 
@@ -445,4 +473,4 @@
    live for the life of the \texttt{Problem} object. If the user wishes to
   keep control over the destruction of these objects, then they can
  do this by setting the corresponding enums in the \texttt{Options} struct. Even though \texttt{Problem} takes ownership of these pointers, it does not preclude the user from re-using them in another residual or parameter block. The destructor takes care to call
-  delete on each  pointer only once.
\ No newline at end of file
+  delete on each pointer only once.