Documentation & Logging cleanups.

1. Document the use of Dogleg and add a general discussion of
   trust region methods.
2. Add a TBD section on compiler/linker flags.
3. Summary::FullReport now prints out sparse_linear_algebra_library
   and trust_region_strategy_type.

Change-Id: I01f680070d510715900f345364855689005d54bb
diff --git a/.gitignore b/.gitignore
index c1c6c08..ddbb5d2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -5,6 +5,8 @@
 *.bak
 *.orig
 build/
+build-release/
+build-debug/
 *.aux
 *.blg
 *.toc
diff --git a/docs/build.tex b/docs/build.tex
index f329050..6ed4991 100644
--- a/docs/build.tex
+++ b/docs/build.tex
@@ -90,10 +90,10 @@
 We are now ready to build and test Ceres. Note that \texttt{cmake} requires the exact path to the \texttt{libglog.a} and \texttt{libgflags.a} libraries.
 
 \begin{minted}{bash}
-tar zxf ceres-solver-1.0.tar.gz
+tar zxf ceres-solver-1.2.1.tar.gz
 mkdir ceres-bin
 cd ceres-bin
-cmake ../ceres-solver-1.0
+cmake ../ceres-solver-1.2.1
 make -j3
 make test
 \end{minted}
@@ -102,7 +102,7 @@
 included problems, which comes from the University of Washington's BAL dataset~\cite{Agarwal10bal}:
 \begin{minted}{bash}
 examples/simple_bundle_adjuster \
-  ../ceres-solver-1.0/data/problem-16-22106-pre.txt \
+  ../ceres-solver-1.2.1/data/problem-16-22106-pre.txt \
 \end{minted}
 This runs Ceres for a maximum of 10 iterations using the  \denseschur\ linear solver. The output should look something like this.
 \clearpage
@@ -178,14 +178,21 @@
 
 We are now ready to build and test Ceres.
 \begin{minted}{bash}
-tar zxf ceres-solver-1.0.tar.gz
+tar zxf ceres-solver-1.2.1.tar.gz
 mkdir ceres-bin
 cd ceres-bin
-cmake ../ceres-solver-1.0
+cmake ../ceres-solver-1.2.1
 make -j3
 make test
 \end{minted}
 Like the Linux build, you should now be able to run \texttt{examples/simple\_bundle\_adjuster}.
+
+
+\section{Compiler Flags to use when building your own applications}
+\label{sec:compiler-flags}
+TBD
+
+
 \section{Customizing the Build Process}
 \label{sec:custom}
 It is possible to reduce the libraries needed to build Ceres and
diff --git a/docs/ceres-solver.tex b/docs/ceres-solver.tex
index 994fb29..ca839ea 100644
--- a/docs/ceres-solver.tex
+++ b/docs/ceres-solver.tex
@@ -8,6 +8,8 @@
 \usepackage[pdftex]{graphicx}
 \usepackage[sort&compress]{natbib}
 \usepackage[breaklinks=true,letterpaper=true,colorlinks,bookmarks=false]{hyperref}
+\usepackage{algorithm}
+\usepackage{algorithmic}
 
 % page dimensions
 \addtolength{\textwidth}{1.5in}
diff --git a/docs/changes.tex b/docs/changes.tex
index 7fca3ee..4daaeee 100644
--- a/docs/changes.tex
+++ b/docs/changes.tex
@@ -1,6 +1,21 @@
 %!TEX root = ceres-solver.tex
 
 \chapter{Version History}
+\section*{1.2.1}
+\subsection{New Features}
+\begin{itemize}
+	\item Powell's Dogleg solver
+	\item Documentation now has a brief overview of Trust Region methods and how the Levenberg-Marquardt and Dogleg methods work.
+\end{itemize}
+\subsection{Bug Fixes}
+\begin{itemize}
+	\item Destructor for TrustRegionStrategy was not virtual (Thanks Markus Moll)
+	\item Invalid DCHECK in suitesparse.cc (Thanks Markus Moll)
+	\item Logging level changes in ConjugateGradientsSolver
+	\item VisibilityBasedPreconditioner setup does not account for skipped camera pairs. This was debugging code.
+	\item Enable SSE support on MacOS
+	\item \texttt{system\_test} was taking too long and too much memory (Thanks Koichi Akabe)
+\end{itemize}
 \section*{1.2.0}
 \subsection{New Features}
 \begin{itemize}
diff --git a/docs/helloworld.tex b/docs/helloworld.tex
index e0d5f1d..0a36229 100644
--- a/docs/helloworld.tex
+++ b/docs/helloworld.tex
@@ -11,7 +11,6 @@
 Let us write this problem as a non-linear least squares problem by defining the scalar residual function $f_1(x) = 10 - x$. Then $F(x) = [f_1(x)]$ is a residual vector with exactly one component.
 
 When solving a problem with Ceres, the first thing to do is to define a subclass of \texttt{CostFunction}. It is responsible for computing the value of the residual function and its derivative (also known as the Jacobian) with respect to $x$.
-
 \begin{minted}[mathescape]{c++}
 class SimpleCostFunction
   : public ceres::SizedCostFunction<1 /* number of residuals */,
@@ -31,7 +30,6 @@
   }
 };
 \end{minted}
-
 \texttt{SimpleCostFunction} is provided with an input array of parameters, an output array for residuals, and an optional output array for Jacobians. In our example, there is just one parameter and one residual, and this is known at compile time; therefore, instead of inheriting from \texttt{CostFunction}, we can save some code and inherit from the templated \texttt{SizedCostFunction} class.
 
 
diff --git a/docs/nnlsq.tex b/docs/nnlsq.tex
index 5de8131..02d6b9e 100644
--- a/docs/nnlsq.tex
+++ b/docs/nnlsq.tex
@@ -8,7 +8,7 @@
 \begin{equation}
 	\arg \min_x \frac{1}{2} \sum_{i=1}^k \|f_i(x)\|^2.
 \end{equation}
-is a Non-linear Least Squares problem~\footnote{Ceres can solve a more general version of this problem, but for pedagogical reasons, we will restrict ourselves to this class of problems for now. See section~\ref{chapter:overview} for a full description of the problems that Ceres can solve}. Here $\|\cdot\|$ denotes the Euclidean norm of a vector. 
+is a non-linear least squares problem~\footnote{Ceres can solve a more general version of this problem, but for pedagogical reasons, we will restrict ourselves to this class of problems for now. See chapter~\ref{chapter:overview} for a full description of the problems that Ceres can solve}. Here $\|\cdot\|$ denotes the Euclidean norm of a vector.
 
 Such optimization problems arise in almost every area of science and engineering. Whenever there is data to be analyzed, curves to be fitted, there is usually a linear or a non-linear least squares problem lurking in there somewhere. 
 
diff --git a/docs/solving.tex b/docs/solving.tex
index ddc7bae..e6bee4a 100644
--- a/docs/solving.tex
+++ b/docs/solving.tex
@@ -1,35 +1,66 @@
 %!TEX root = ceres-solver.tex
 \chapter{Solving}
-Effective use of Ceres requires some familiarity with the basic components of a nonlinear least squares solver. 
+Effective use of Ceres requires some familiarity with the basic components of a non-linear least squares solver, so before we describe how to configure the solver, we will take a brief look at how the core optimization algorithms in Ceres work, and at the various linear solvers and preconditioners that power them.
 
-The Levenberg-Marquardt algorithm~\cite{levenberg1944method, marquardt1963algorithm} is the most popular algorithm for solving non-linear least squares problems. Ceres implements an exact step~\cite{madsen2004methods} and an inexact step variant of the Levenberg-Marquardt algorithm~\cite{wright1985inexact,nash1990assessing}.  Before we describe how to configure the solver, we will begin by taking a brief look at how the Levenberg-Marquardt algorithm works and the various linear solvers and preconditioners that power it.
-
-
-\section{Levenberg-Marquardt Algorithm}
+\section{Trust Region Methods}
 Let $x \in \mathbb{R}^{n}$ be an $n$-dimensional vector of variables, and
 $ F(x) = \left[f_1(x),   \hdots,  f_{m}(x) \right]^{\top}$ be a $m$-dimensional function of $x$.  We are interested in solving the following optimization problem~\footnote{At the level of the non-linear solver, the block and residual structure is not relevant, therefore our discussion here is in terms of an optimization problem defined over a state vector of size $n$.},
 \begin{equation}
         \arg \min_x \frac{1}{2}\|F(x)\|^2\ .
         \label{eq:nonlinsq}
 \end{equation}
-The Jacobian $J(x)$ of $F(x)$ is an $m\times n$ matrix, where $J_{ij}(x) = \partial_j f_i(x)$  and the gradient vector $g(x) = \nabla  \frac{1}{2}\|F(x)\|^2 = J(x)^\top F(x)$. Since the efficient global optimization of~\eqref{eq:nonlinsq} for general $F(x)$ is an intractable problem, we will have to settle for finding a local minimum.
+Here, the Jacobian $J(x)$ of $F(x)$ is an $m\times n$ matrix, where $J_{ij}(x) = \partial_j f_i(x)$  and the gradient vector $g(x) = \nabla  \frac{1}{2}\|F(x)\|^2 = J(x)^\top F(x)$. Since the efficient global optimization of~\eqref{eq:nonlinsq} for general $F(x)$ is an intractable problem, we will have to settle for finding a local minimum.
 
 The general strategy when solving non-linear optimization problems is to solve a sequence of approximations to the original problem~\cite{nocedal2000numerical}. At each iteration, the approximation is solved to determine a correction $\Delta x$ to the vector $x$. For non-linear least squares, an approximation can be constructed by using the linearization $F(x+\Delta x) \approx F(x) + J(x)\Delta x$, which leads to the following linear least squares  problem:
 \begin{equation}
          \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2
         \label{eq:linearapprox}
 \end{equation}
-Unfortunately, na\"ively solving a sequence of these problems and updating $x \leftarrow x+ \Delta x$ leads to an algorithm that may not converge.  To get a convergent algorithm, we need to control the size of the step $\Delta x$.  One way to do this is to introduce a regularization term:
-\begin{equation}
-         \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \mu \|D(x)\Delta x\|^2\ .
-        \label{eq:lsqr}
-\end{equation}
-Here, $D(x)$ is a non-negative diagonal matrix, typically the square root of the diagonal of the matrix $J(x)^\top J(x)$ and $\mu$ is a non-negative parameter that controls the strength of regularization. It is straightforward to show that the step size $\|\Delta x\|$ is inversely related to $\mu$. Levenberg-Marquardt updates the value of $\mu$ at each step based on how well the Jacobian $J(x)$ approximates $F(x)$. The quality of this fit is measured by the ratio of  the actual decrease in the objective function to the decrease in the value of the linearized model $L(\Delta x) = \frac{1}{2}\|J(x)\Delta x + F(x)\|^2$.
-\begin{equation}
-\rho = \frac{\|F(x + \Delta x)\|^2 - \|F(x)\|^2}{\|J(x)\Delta x + F(x)\|^2 - \|F(x)\|^2}
-\end{equation}
+Unfortunately, na\"ively solving a sequence of these problems and updating $x \leftarrow x + \Delta x$ leads to an algorithm that may not converge.  To get a convergent algorithm, we need to control the size of the step $\Delta x$. This is where the idea of a trust region comes in. The generic trust-region loop for non-linear least squares problems looks something like this:
 
-If $\rho$ is large, that means the linear model is a good approximation to the non-linear model and it is worth trusting it more in the computation of $\Delta x$, so we decrease $\mu$. If $\rho$ is small, the linear model is a poor approximation and $\mu$ is increased. This kind of reasoning is the basis of Trust-region methods, of which Levenberg-Marquardt is an early example~\cite{nocedal2000numerical}.
+
+\begin{algorithmic}
+\REQUIRE Initial point $x$ and a trust region radius $\mu$.
+\LOOP
+\STATE{Solve $\arg \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2$ s.t. $\|D(x)\Delta x\|^2 \le \mu$}
+\STATE{$\rho = \frac{\displaystyle \|F(x + \Delta x)\|^2 - \|F(x)\|^2}{\displaystyle \|J(x)\Delta x + F(x)\|^2 - \|F(x)\|^2}$}
+\IF {$\rho > \epsilon$}
+\STATE{$x = x + \Delta x$}
+\ENDIF
+\IF {$\rho > \eta_1$}
+\STATE{$\mu = 2 * \mu$}
+\ELSE
+\IF {$\rho < \eta_2$}
+\STATE {$\mu = 0.5 * \mu$}
+\ENDIF
+\ENDIF
+\ENDLOOP
+\end{algorithmic}
+
+Here, $\mu$ is the trust region radius, $D(x)$ is some matrix used to define a metric on the domain of $F(x)$, and $\rho$ measures the quality of the step $\Delta x$, i.e., how well the linear model predicted the decrease in the value of the non-linear objective. The constants $\epsilon$, $\eta_1$ and $\eta_2$ determine, respectively, when a step is accepted and when the trust region radius is grown or shrunk. The idea is to increase or decrease the radius of the trust region depending on how well the linearization predicts the behavior of the non-linear objective, which in turn is reflected in the value of $\rho$.
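+
+The radius update logic in the loop above is simple enough to state directly in code. The following fragment is only an illustrative sketch; the function name and the particular values of $\epsilon$, $\eta_1$ and $\eta_2$ are made up for the example and are not part of the Ceres API or its defaults.
+\begin{minted}{c++}
+// Illustrative sketch of the trust region radius update in the generic
+// loop above. Not Ceres API; thresholds and scaling factors are example
+// values only.
+double UpdateTrustRegionRadius(double rho, double mu) {
+  const double eta1 = 0.75;  // Step agreed well with the model: grow the region.
+  const double eta2 = 0.25;  // Step agreed poorly with the model: shrink it.
+  if (rho > eta1) {
+    return 2.0 * mu;
+  }
+  if (rho < eta2) {
+    return 0.5 * mu;
+  }
+  return mu;  // Otherwise leave the radius unchanged.
+}
+\end{minted}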
+
+The key computational step in a trust-region algorithm is the solution of the constrained optimization problem
+\begin{align}
+        \arg\min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
+        \text{such that}&\quad  \|D(x)\Delta x\|^2 \le \mu
+\label{eq:trp}
+\end{align}
+
+There are a number of different ways of solving this problem, each giving rise to a different concrete trust-region algorithm. Currently, Ceres implements two trust-region algorithms: Levenberg-Marquardt and Powell's Dogleg.
+
+\subsection{Levenberg-Marquardt}
+The Levenberg-Marquardt algorithm~\cite{levenberg1944method, marquardt1963algorithm} is the most popular algorithm for solving non-linear least squares problems; it was also the first trust region algorithm to be developed. Ceres implements an exact step~\cite{madsen2004methods} and an inexact step variant of the Levenberg-Marquardt algorithm~\cite{wright1985inexact,nash1990assessing}.
+
+It can be shown that the solution to~\eqref{eq:trp} can be obtained by solving an unconstrained optimization problem of the form
+\begin{align}
+        \arg\min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 +\lambda  \|D(x)\Delta x\|^2
+\end{align}
+where $\lambda$ is a Lagrange multiplier that is inversely related to $\mu$. In Ceres, we solve for
+\begin{align}
+        \arg\min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \frac{1}{\mu} \|D(x)\Delta x\|^2
+\label{eq:lsqr}
+\end{align}
+The matrix $D(x)$ is a non-negative diagonal matrix, typically the square root of the diagonal of the matrix $J(x)^\top J(x)$.
 
 Before going further, let us make some notational simplifications. We will assume that the matrix $\frac{1}{\sqrt{\mu}} D$ has been concatenated at the bottom of the matrix $J$ and similarly a vector of zeros has been added to the bottom of the vector $f$ and the rest of our discussion will be in terms of $J$ and $f$, \ie the linear least squares problem.
 \begin{align}
@@ -46,15 +77,36 @@
 \end{equation}
 Here, $k$ indicates the Levenberg-Marquardt iteration number and $0 < \eta_k <1$ is known as the forcing sequence.  Wright \& Holt \cite{wright1985inexact} prove that a truncated Levenberg-Marquardt algorithm that uses an inexact Newton step based on~\eqref{eq:inexact} converges for any sequence $\eta_k \leq \eta_0 < 1$ and the rate of convergence depends on the choice of the forcing sequence $\eta_k$.
 
-Ceres supports both exact and inexact step solution strategies.
+Ceres supports both exact and inexact step solution strategies. When the user chooses a factorization based linear solver, the exact step Levenberg-Marquardt algorithm is used. When the user chooses an iterative linear solver, the inexact step Levenberg-Marquardt algorithm is used.
+
+\subsection{Powell's Dogleg}
+Another strategy for solving the trust region problem~\eqref{eq:trp} was introduced by M. J. D. Powell. The key idea there is to compute two vectors
+\begin{align}
+        \Delta x^{\text{Gauss-Newton}} &= \arg \min_{\Delta x}\frac{1}{2} \|J(x)\Delta x + f(x)\|^2.\\
+        \Delta x^{\text{Cauchy}} &= -\frac{\|g(x)\|^2}{\|J(x)g(x)\|^2}g(x).
+\end{align}
+Note that the vector $\Delta x^{\text{Gauss-Newton}}$ is the solution to~\eqref{eq:linearapprox} and $\Delta x^{\text{Cauchy}}$ is the vector that minimizes the linear approximation if we restrict ourselves to moving along the direction of the gradient.
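+
+For concreteness, the two vectors can be written down directly in terms of $J$ and $f$. The following is a small, purely illustrative sketch using dense Eigen types; it is not how Ceres computes these quantities internally, and the function name is invented for the example.
+\begin{minted}{c++}
+#include "Eigen/Dense"
+
+// Illustrative only: the Gauss-Newton and Cauchy steps for a small dense
+// problem, following the two equations above.
+void ComputeDoglegVectors(const Eigen::MatrixXd& J, const Eigen::VectorXd& f,
+                          Eigen::VectorXd* dx_gauss_newton,
+                          Eigen::VectorXd* dx_cauchy) {
+  // Gauss-Newton step: least squares solution of J dx = -f.
+  *dx_gauss_newton = J.colPivHouseholderQr().solve(-f);
+  // Cauchy step: the gradient g = J^T f, scaled by the exact minimizer of
+  // the linear model along the steepest descent direction.
+  const Eigen::VectorXd g = J.transpose() * f;
+  *dx_cauchy = -(g.squaredNorm() / (J * g).squaredNorm()) * g;
+}
+\end{minted}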
+
+Then Powell's Dogleg method finds a vector $\Delta x$ in the two-dimensional subspace defined by $\Delta x^{\text{Gauss-Newton}}$ and $\Delta x^{\text{Cauchy}}$ that solves the trust region problem. For more details on the exact reasoning and computations, please see Madsen et al.~\cite{madsen2004methods}.
+
+The key advantage of Dogleg over Levenberg-Marquardt is that if the step computation for a particular choice of $\mu$ does not result in sufficient decrease in the value of the objective function, Levenberg-Marquardt solves the linear approximation from scratch with a smaller value of $\mu$. Dogleg, on the other hand, only needs to compute the interpolation between the Gauss-Newton and the Cauchy vectors, as neither of them depends on the value of $\mu$.
+
+The Dogleg method can only be used with the exact factorization based linear solvers.
 
 \section{\texttt{LinearSolver}}
-Let $H(x)= J(x)^\top J(x)$ and $g(x) = -J(x)^\top  f(x)$. For notational convenience let us also drop the dependence on $x$. Then it is easy to see that solving~\eqref{eq:simple} is equivalent to solving the {\em normal equations}
+Recall that in both of the trust-region methods described above, the key computational cost is the solution of a linear least squares problem of the form
+\begin{align}
+ \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
+ \label{eq:simple2}
+\end{align}
+
+
+Let $H(x)= J(x)^\top J(x)$ and $g(x) = -J(x)^\top  f(x)$. For notational convenience let us also drop the dependence on $x$. Then it is easy to see that solving~\eqref{eq:simple2} is equivalent to solving the {\em normal equations}
 \begin{align}
 H \Delta x  &= g \label{eq:normal}
 \end{align}
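+
+For a very small, dense problem the normal equations can be formed and solved directly, for example with Eigen. The sketch below is purely illustrative (the function name is invented); for the problems Ceres is designed for, explicitly forming $H$ this way is neither necessary nor advisable.
+\begin{minted}{c++}
+#include "Eigen/Dense"
+
+// Illustrative only: solve H dx = g for a small dense Jacobian.
+Eigen::VectorXd SolveNormalEquations(const Eigen::MatrixXd& J,
+                                     const Eigen::VectorXd& f) {
+  const Eigen::MatrixXd H = J.transpose() * J;   // H = J^T J
+  const Eigen::VectorXd g = -J.transpose() * f;  // g = -J^T f
+  // H is symmetric positive semi-definite, so a Cholesky-type
+  // factorization can be used.
+  return H.ldlt().solve(g);
+}
+\end{minted}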
 
-For all but the smallest problems the solution of~\eqref{eq:normal} in each iteration of the Levenberg-Marquardt algorithm is the dominant computational cost in Ceres. Ceres provides a number of different options for solving~\eqref{eq:normal}.
+Ceres provides a number of different options for solving~\eqref{eq:normal}.
 
 \subsection{\texttt{DENSE\_QR}}
 For small problems (a couple of hundred parameters and a few thousand residuals) with relatively dense Jacobians, \texttt{DENSE\_QR} is the method of choice~\cite{bjorck1996numerical}. Let $J = QR$ be the QR-decomposition of $J$, where $Q$ is an orthonormal matrix and $R$ is an upper triangular matrix~\cite{trefethen1997numerical}. Then it can be shown that the solution to~\eqref{eq:normal} is given by
@@ -146,7 +198,7 @@
 \end{align}
 When the user chooses \texttt{ITERATIVE\_SCHUR} as the linear solver, Ceres automatically switches from the exact step algorithm to an inexact step algorithm.
 
-%Currently only the \texttt{JACOBI} preconditioner is available for use with this solver. It uses the block diagonal of $H$ as a preconditioner. 
+%Currently only the \texttt{JACOBI} preconditioner is available for use with this solver. It uses the block diagonal of $H$ as a preconditioner.
 
 
 \subsection{\texttt{ITERATIVE\_SCHUR}}
@@ -171,6 +223,7 @@
 
 The computational cost of using a preconditioner $M$ is the cost of computing $M$ and evaluating the product $M^{-1}y$ for arbitrary vectors $y$. Thus, there are two competing factors to consider: How much of $H$'s structure is captured by $M$ so that the condition number $\kappa(HM^{-1})$ is low, and the computational cost of constructing and using $M$.  The ideal preconditioner would be one for which $\kappa(M^{-1}A) =1$. $M=A$ achieves this, but it is not a practical choice, as applying this preconditioner would require solving a linear system equivalent to the unpreconditioned problem.  It is usually the case that the more information $M$ has about $H$, the more expensive it is to use. For example, Incomplete Cholesky factorization based preconditioners have much better convergence behavior than the Jacobi preconditioner, but are also much more expensive.
 
+
 The simplest of all preconditioners is the diagonal or Jacobi preconditioner, \ie,  $M=\operatorname{diag}(A)$, which for block structured matrices like $H$ can be generalized to the block Jacobi preconditioner.
 
 For \texttt{ITERATIVE\_SCHUR} there are two obvious choices for block diagonal preconditioners for $S$. The block diagonal of the matrix $B$~\cite{mandel1990block} and the block diagonal of $S$, \ie the block Jacobi preconditioner for $S$. Ceres implements both of these preconditioners and refers to them as \texttt{JACOBI} and \texttt{SCHUR\_JACOBI} respectively.
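+
+For example, to use the inexact step algorithm with the \texttt{SCHUR\_JACOBI} preconditioner, the relevant options would be set along the following lines (a sketch only; \texttt{problem} is assumed to be an already constructed \texttt{ceres::Problem}):
+\begin{minted}{c++}
+ceres::Solver::Options options;
+options.linear_solver_type = ceres::ITERATIVE_SCHUR;
+options.preconditioner_type = ceres::SCHUR_JACOBI;
+
+ceres::Solver::Summary summary;
+ceres::Solve(options, &problem, &summary);
+\end{minted}
+Depending on the structure of the problem, \texttt{num\_eliminate\_blocks} and the ordering related options described below may also need to be set.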
@@ -198,65 +251,66 @@
 \texttt{Solver::Options} controls the overall behavior of the solver. We list the various settings and their default values below.
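+
+As a quick illustration of how these settings are used, a typical configuration might look something like the following sketch (the particular values are illustrative rather than recommendations, and \texttt{problem} is assumed to be an already constructed \texttt{ceres::Problem}):
+\begin{minted}{c++}
+ceres::Solver::Options options;
+options.trust_region_strategy_type = ceres::DOGLEG;
+options.linear_solver_type = ceres::DENSE_QR;
+options.max_num_iterations = 100;
+options.minimizer_progress_to_stdout = true;
+
+ceres::Solver::Summary summary;
+ceres::Solve(options, &problem, &summary);
+std::cout << summary.FullReport() << "\n";
+\end{minted}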
 
 \begin{enumerate}
-    
-\item{\texttt{trust\_region\_strategy\_type}}(\texttt{LEVENBERG\_MARQUARDT}) The  trust region step computation algorithm used by Ceres. \texttt{LEVENBERG\_MARQUARDT} is currently the only valid value.
 
-\item{\texttt{max\_num\_iterations}}(\texttt{50}) Maximum number of iterations for Levenberg-Marquardt.
+\item{\texttt{trust\_region\_strategy\_type }} (\texttt{LEVENBERG\_MARQUARDT}) The  trust region step computation algorithm used by Ceres. Currently \texttt{LEVENBERG\_MARQUARDT } and \texttt{DOGLEG} are the two valid choices.
 
-\item{\texttt{max\_solver\_time\_sec}} ($10^9$) Maximum amount of time (in seconds) for which the solver should run.
+\item{\texttt{max\_num\_iterations }}(\texttt{50}) Maximum number of iterations for Levenberg-Marquardt.
 
-\item{\texttt{num\_threads}}(\texttt{1})
+\item{\texttt{max\_solver\_time\_in\_seconds }} ($10^9$) Maximum amount of time for which the solver should run.
+
+\item{\texttt{num\_threads }}(\texttt{1})
 Number of threads used by Ceres to evaluate the Jacobian.
 
-\item{\texttt{initial\_trust\_region\_radius} ($10^4$)} The size of the initial trust region. When the \texttt{LEVENBERG\_MARQUARDT} strategy is used, the reciprocal of this number is the initial regularization parameter.
+\item{\texttt{initial\_trust\_region\_radius } ($10^4$)} The size of the initial trust region. When the \texttt{LEVENBERG\_MARQUARDT} strategy is used, the reciprocal of this number is the initial regularization parameter.
 
-\item{\texttt{max\_trust\_region\_radius} ($10^{16}$)} The trust region radius is not allowed to grow beyond this value.
-\item{\texttt{max\_trust\_region\_radius} ($10^{-32}$)} The solver terminates, when the trust region becomes smaller than this value.
+\item{\texttt{max\_trust\_region\_radius } ($10^{16}$)} The trust region radius is not allowed to grow beyond this value.
+\item{\texttt{min\_trust\_region\_radius } ($10^{-32}$)} The solver terminates when the trust region radius becomes smaller than this value.
 
-\item{\texttt{min\_relative\_decrease}}($10^{-3}$) Lower threshold for relative decrease before a Levenberg-Marquardt step is acceped.
+\item{\texttt{min\_relative\_decrease }}($10^{-3}$) Lower threshold for relative decrease before a Levenberg-Marquardt step is accepted.
 
-\item{\texttt{lm\_min\_diagonal} ($10^6$)} The \texttt{LEVENBERG\_MARQUARDT} strategy, uses a diagonal matrix to regularize the the trust region step. This is the lower bound on the values of this diagonal matrix.
+\item{\texttt{lm\_min\_diagonal } ($10^6$)} The \texttt{LEVENBERG\_MARQUARDT} strategy uses a diagonal matrix to regularize the trust region step. This is the lower bound on the values of this diagonal matrix.
 
-\item{\texttt{lm\_max\_diagonal} ($10^{32}$)}  The \texttt{LEVENBERG\_MARQUARDT} strategy, uses a diagonal matrix to regularize the the trust region step. This is the upper bound on the values of this diagonal matrix.
+\item{\texttt{lm\_max\_diagonal } ($10^{32}$)}  The \texttt{LEVENBERG\_MARQUARDT} strategy uses a diagonal matrix to regularize the trust region step. This is the upper bound on the values of this diagonal matrix.
 
-\item{\texttt{max\_num\_consecutive\_invalid\_steps} (5)} The step returned by a trust region strategy can sometimes be numerically invalid, usually because of conditioning issues. Instead of crashing or stopping the optimization, the optimizer can go ahead and try solving with a smaller trust region/better conditioned problem. This parameter sets the number of consecutive retries before the minimizer gives up.
+\item{\texttt{max\_num\_consecutive\_invalid\_steps } (5)} The step returned by a trust region strategy can sometimes be numerically invalid, usually because of conditioning issues. Instead of crashing or stopping the optimization, the optimizer can go ahead and try solving with a smaller trust region/better conditioned problem. This parameter sets the number of consecutive retries before the minimizer gives up.
 
-\item{\texttt{function\_tolerance}}($10^{-6}$) Solver terminates if
+\item{\texttt{function\_tolerance }}($10^{-6}$) Solver terminates if
 \begin{align}
 \frac{|\Delta \text{cost}|}{\text{cost}} < \texttt{function\_tolerance}
 \end{align}
 where, $\Delta \text{cost}$ is the change in objective function value (up or down) in the current iteration of Levenberg-Marquardt.
 
-\item \texttt{Solver::Options::gradient\_tolerance} Solver terminates if 
+\item{\texttt{gradient\_tolerance }} Solver terminates if
 \begin{equation}
     \frac{\|g(x)\|_\infty}{\|g(x_0)\|_\infty} < \texttt{gradient\_tolerance}
 \end{equation}
 where $\|\cdot\|_\infty$ refers to the max norm, and $x_0$ is the vector of initial parameter values.
 
-\item{\texttt{parameter\_tolerance}}($10^{-8}$) Solver terminates if 
+\item{\texttt{parameter\_tolerance }}($10^{-8}$) Solver terminates if
 \begin{equation}
     \frac{\|\Delta x\|}{\|x\| + \texttt{parameter\_tolerance}} < \texttt{parameter\_tolerance}
 \end{equation}
 where $\Delta x$ is the step computed by the linear solver in the current iteration of Levenberg-Marquardt.
 
-\item{\texttt{linear\_solver\_type}(\texttt{SPARSE\_NORMAL\_CHOLESKY})}
 
-\item{\texttt{linear\_solver\_type}}(\texttt{SPARSE\_NORMAL\_CHOLESKY}/\texttt{DENSE\_QR}) Type of linear solver used to compute the solution to the linear least squares problem in each iteration of the Levenberg-Marquardt algorithm. If Ceres is build with \suitesparse linked in  then the default is \texttt{SPARSE\_NORMAL\_CHOLESKY}, it is \texttt{DENSE\_QR} otherwise.
+\item{\texttt{linear\_solver\_type }}(\texttt{SPARSE\_NORMAL\_CHOLESKY}/\texttt{DENSE\_QR}) Type of linear solver used to compute the solution to the linear least squares problem in each iteration of the Levenberg-Marquardt algorithm. If Ceres is built with \suitesparse linked in, then the default is \texttt{SPARSE\_NORMAL\_CHOLESKY}; otherwise it is \texttt{DENSE\_QR}.
 
-\item{\texttt{preconditioner\_type}}(\texttt{JACOBI}) The preconditioner used by the iterative linear solver. The default is the block Jacobi preconditioner. Valid values are (in increasing order of complexity) \texttt{IDENTITY},\texttt{JACOBI}, \texttt{SCHUR\_JACOBI}, \texttt{CLUSTER\_JACOBI} and \texttt{CLUSTER\_TRIDIAGONAL}.
+\item{\texttt{preconditioner\_type }}(\texttt{JACOBI}) The preconditioner used by the iterative linear solver. The default is the block Jacobi preconditioner. Valid values are (in increasing order of complexity) \texttt{IDENTITY}, \texttt{JACOBI}, \texttt{SCHUR\_JACOBI}, \texttt{CLUSTER\_JACOBI} and \texttt{CLUSTER\_TRIDIAGONAL}.
 
-\item{\texttt{sparse\_linear\_algebra\_library} (\texttt{SUITE\_SPARSE})} Ceres supports the use of two sparse linear algebra libraries, \texttt{SuiteSparse}, which is enabled by setting this parameter to \texttt{SUITE\_SPARSE} and \texttt{CXSparse}, which can be selected by setting this parameter to $\texttt{CX\_SPARSE}$. \texttt{SuiteSparse} is a sophisticated and complex sparse linear algebra library and should be used in general. If your needs/platforms prevent you from using \texttt{SuiteSparse}, consider using \texttt{CXSparse}, which is a much smaller, easier to build library. As can be expected, its performance on large problems is not comparable to that of \texttt{SuiteSparse}.
+\item{\texttt{sparse\_linear\_algebra\_library } (\texttt{SUITE\_SPARSE})} Ceres supports the use of two sparse linear algebra libraries: \texttt{SuiteSparse}, which is enabled by setting this parameter to \texttt{SUITE\_SPARSE}, and \texttt{CXSparse}, which can be selected by setting this parameter to \texttt{CX\_SPARSE}. \texttt{SuiteSparse} is a sophisticated and complex sparse linear algebra library and should be used in general. If your needs/platforms prevent you from using \texttt{SuiteSparse}, consider using \texttt{CXSparse}, which is a much smaller and easier to build library. As can be expected, its performance on large problems is not comparable to that of \texttt{SuiteSparse}.
 
-\item{\texttt{num\_linear\_solver\_threads}}(\texttt{1}) Number of threads used by the linear solver. 
 
-\item{\texttt{num\_eliminate\_blocks}}(\texttt{0}) 
+\item{\texttt{num\_linear\_solver\_threads }}(\texttt{1}) Number of threads used by the linear solver.
+
+\item{\texttt{num\_eliminate\_blocks }}(\texttt{0})
 For Schur reduction based methods, the first \texttt{num\_eliminate\_blocks} parameter blocks are
     eliminated using the Schur reduction. For example, when solving
      traditional structure from motion problems where the parameters are in
      two classes (cameras and points) then \texttt{num\_eliminate\_blocks} would be the
      number of points.
 
-\item{\texttt{ordering\_type}}(\texttt{NATURAL})
+\item{\texttt{ordering\_type }}(\texttt{NATURAL})
  Internally Ceres reorders the parameter blocks to help the
  various linear solvers. This parameter allows the user to
      influence the re-ordering strategy used. For structure from
@@ -265,11 +319,11 @@
      scheme, for example in conjunction with \texttt{num\_eliminate\_blocks},
      use \texttt{USER}.
 
-\item{\texttt{ordering}} The ordering of the parameter blocks. The solver pays attention
+\item{\texttt{ordering }} The ordering of the parameter blocks. The solver pays attention
     to it if the \texttt{ordering\_type} is set to \texttt{USER} and the ordering vector is
     non-empty.
 
-\item{\texttt{use\_block\_amd} (\texttt{true})} By virtue of the modeling layer in Ceres being block oriented,
+\item{\texttt{use\_block\_amd } (\texttt{true})} By virtue of the modeling layer in Ceres being block oriented,
 all the matrices used by Ceres are also block oriented.
 When doing sparse direct factorization of these matrices, the
 fill-reducing ordering algorithms can either be run on the
@@ -278,69 +332,69 @@
 matrix to the Cholesky factorization routines. This leads to
 substantial gains in factorization performance. Setting this parameter to true enables the use of a block oriented Approximate Minimum Degree ordering algorithm. Setting it to \texttt{false} uses a scalar AMD algorithm. This option only makes sense when using \texttt{sparse\_linear\_algebra\_library = SUITE\_SPARSE}, as it uses the \texttt{AMD} package that is part of \texttt{SuiteSparse}.
 
-\item{\texttt{linear\_solver\_min\_num\_iterations}}(\texttt{1}) Minimum number of iterations used by the linear solver. This only makes sense when the linear solver is an iterative solver, e.g., \texttt{ITERATIVE\_SCHUR}.
+\item{\texttt{linear\_solver\_min\_num\_iterations }}(\texttt{1}) Minimum number of iterations used by the linear solver. This only makes sense when the linear solver is an iterative solver, e.g., \texttt{ITERATIVE\_SCHUR}.
 
-\item{\texttt{linear\_solver\_max\_num\_iterations}}(\texttt{500}) Minimum number of iterations used by the linear solver. This only makes sense when the linear solver is an iterative solver, e.g., \texttt{ITERATIVE\_SCHUR}.
+\item{\texttt{linear\_solver\_max\_num\_iterations }}(\texttt{500}) Maximum number of iterations used by the linear solver. This only makes sense when the linear solver is an iterative solver, e.g., \texttt{ITERATIVE\_SCHUR}.
 
-\item{\texttt{eta}} ($10^{-1}$)
+\item{\texttt{eta }} ($10^{-1}$)
  Forcing sequence parameter. The truncated Newton solver uses
     this number to control the relative accuracy with which the
      Newton step is computed. This constant is passed to ConjugateGradientsSolver which uses
      it to terminate the iterations when
-\begin{equation}    
+\begin{equation}
       \frac{Q_i - Q_{i-1}}{Q_i} < \frac{\eta}{i}
 \end{equation}
 
-\item{\texttt{jacobi\_scaling}}(\texttt{true}) \texttt{true} means that the Jacobian is scaled by the norm of its columns before being passed to the linear solver. This improves the numerical conditioning of the normal equations.
+\item{\texttt{jacobi\_scaling }}(\texttt{true}) \texttt{true} means that the Jacobian is scaled by the norm of its columns before being passed to the linear solver. This improves the numerical conditioning of the normal equations.
 
-\item{\texttt{logging\_type}}(\texttt{PER\_MINIMIZER\_ITERATION})
+\item{\texttt{logging\_type }}(\texttt{PER\_MINIMIZER\_ITERATION})
 
 
-\item{\texttt{minimizer\_progress\_to\_stdout}}(\texttt{false})
+\item{\texttt{minimizer\_progress\_to\_stdout }}(\texttt{false})
 By default the Minimizer progress is logged to \texttt{STDERR} depending on the \texttt{vlog} level. If this flag is
-set to true, and \texttt{logging\_type} is not \texttt{SILENT}, the logging output
+set to true, and \texttt{logging\_type } is not \texttt{SILENT}, the logging output
 is sent to \texttt{STDOUT}.
 
-\item{\texttt{return\_initial\_residuals}}(\texttt{false})
-\item{\texttt{return\_final\_residuals}}(\texttt{false})
+\item{\texttt{return\_initial\_residuals }}(\texttt{false})
+\item{\texttt{return\_final\_residuals }}(\texttt{false})
 
 
-\item{\texttt{lsqp\_iterations\_to\_dump}}
+\item{\texttt{lsqp\_iterations\_to\_dump }}
  List of iterations at which the optimizer should dump the
      linear least squares problem to disk. Useful for testing and
      benchmarking. If empty (default), no problems are dumped.
 
-\item{\texttt{lsqp\_dump\_directory}} (\texttt{/tmp})
+\item{\texttt{lsqp\_dump\_directory }} (\texttt{/tmp})
 If \texttt{lsqp\_iterations\_to\_dump} is non-empty, then this setting determines the directory to which the files containing the linear least squares problems are written.
 
 
-\item{\texttt{lsqp\_dump\_format}}(\texttt{TEXTFILE}) The format in which linear least squares problems should be logged
+\item{\texttt{lsqp\_dump\_format }}(\texttt{TEXTFILE}) The format in which linear least squares problems should be logged
 when \texttt{lsqp\_iterations\_to\_dump} is non-empty.  There are three options
 \begin{itemize}
-\item{\texttt{CONSOLE}} prints the linear least squares problem in a human readable format
+\item{\texttt{CONSOLE }} prints the linear least squares problem in a human readable format
   to \texttt{stderr}. The Jacobian is printed as a dense matrix. The vectors
    $D$, $x$ and $f$ are printed as dense vectors. This should only be used
    for small problems.
-\item{\texttt{PROTOBUF}}  
+\item{\texttt{PROTOBUF }}
    Write out the linear least squares problem to the directory
    pointed to by \texttt{lsqp\_dump\_directory} as a protocol
    buffer. \texttt{linear\_least\_squares\_problems.h/cc} contains routines for
    loading these problems. For details on the on disk format used,
    see \texttt{matrix.proto}. The files are named \texttt{lm\_iteration\_???.lsqp}. This requires that \texttt{protobuf} be linked into Ceres Solver.
-\item{\texttt{TEXTFILE}}
+\item{\texttt{TEXTFILE }}
    Write out the linear least squares problem to the directory
    pointed to by \texttt{lsqp\_dump\_directory} as text files
    which can be read into \texttt{MATLAB/Octave}. The Jacobian is dumped as a
    text file containing $(i,j,s)$ triplets, the vectors $D$, $x$ and $f$ are
    dumped as text files containing a list of their values.
-  
+
    A \texttt{MATLAB/Octave} script called \texttt{lm\_iteration\_???.m} is also output,
    which can be used to parse and load the problem into memory.
 \end{itemize}
 
 
 
-\item{\texttt{check\_gradients}}(\texttt{false})
+\item{\texttt{check\_gradients }}(\texttt{false})
  Check all Jacobians computed by each residual block with finite
      differences. This is expensive since it involves computing the
      derivative by normal means (e.g. user specified, autodiff,
@@ -348,20 +402,20 @@
      results are compared, and if they differ substantially, details
      are printed to the log.
 
-\item{\texttt{gradient\_check\_relative\_precision}} ($10^{-8}$)
+\item{\texttt{gradient\_check\_relative\_precision }} ($10^{-8}$)
  Relative precision to check for in the gradient checker. If the
  relative difference between an element of a user-supplied Jacobian and
  its finite difference estimate exceeds this number, then the Jacobian
  for that cost term is dumped.
 
-\item{\texttt{numeric\_derivative\_relative\_step\_size}} ($10^{-6}$)
+\item{\texttt{numeric\_derivative\_relative\_step\_size }} ($10^{-6}$)
  Relative shift used for taking numeric derivatives. For finite
      differencing, each dimension is evaluated at slightly shifted
      values, \eg for forward differences, the numerical derivative is
-  
+
 \begin{align}
        \delta &= \texttt{numeric\_derivative\_relative\_step\_size}\\
        \Delta f &= \frac{f((1 + \delta)  x) - f(x)}{\delta x}
-\end{align} 
+\end{align}
 
 
      The finite differencing is done along each dimension. The
@@ -372,7 +426,7 @@
      "torture cases" which break this finite difference heuristic,
      but they do not come up often in practice.
 
-\item{\texttt{callbacks}} 
+\item{\texttt{callbacks }}
   Callbacks that are executed at the end of each iteration of the
      \texttt{Minimizer}. They are executed in the order that they are
      specified in this vector. By default, parameter blocks are
@@ -381,12 +435,12 @@
      \texttt{update\_state\_every\_variable}. If the user wishes to have access
 to the updated parameter blocks when his/her callbacks are
      executed, then set \texttt{update\_state\_every\_iteration} to true.
-    
+
      The solver does NOT take ownership of these pointers.
 
-\item{\texttt{update\_state\_every\_iteration}}(\texttt{false})
+\item{\texttt{update\_state\_every\_iteration }}(\texttt{false})
 Normally the parameter blocks are only updated when the solver terminates. Setting this to true updates them in every iteration. This setting is useful when building an interactive application using Ceres and using an \texttt{IterationCallback}; a minimal sketch of such a callback is given after this list.
 \end{enumerate}
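+
+As an illustration of the callback mechanism described above, a minimal \texttt{IterationCallback} might look like the following sketch (the class name and the particular output are made up for the example; \texttt{options} is a \texttt{ceres::Solver::Options} object as before):
+\begin{minted}{c++}
+class LoggingCallback : public ceres::IterationCallback {
+ public:
+  virtual ceres::CallbackReturnType operator()(
+      const ceres::IterationSummary& summary) {
+    std::cout << "iteration: " << summary.iteration
+              << " cost: " << summary.cost << "\n";
+    return ceres::SOLVER_CONTINUE;
+  }
+};
+
+// The solver does NOT take ownership of the callback.
+LoggingCallback callback;
+options.callbacks.push_back(&callback);
+// Only needed if the callback inspects the updated parameter blocks.
+options.update_state_every_iteration = true;
+\end{minted}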
 
 \section{\texttt{Solver::Summary}}
-TBD
\ No newline at end of file
+TBD
diff --git a/include/ceres/solver.h b/include/ceres/solver.h
index 5ca15e9..1084884 100644
--- a/include/ceres/solver.h
+++ b/include/ceres/solver.h
@@ -78,21 +78,23 @@
       linear_solver_type = SPARSE_NORMAL_CHOLESKY;
 #endif
 
+      preconditioner_type = JACOBI;
+
       sparse_linear_algebra_library = SUITE_SPARSE;
 #if defined(CERES_NO_SUITESPARSE) && !defined(CERES_NO_CXSPARSE)
       sparse_linear_algebra_library = CX_SPARSE;
 #endif
 
+      num_linear_solver_threads = 1;
+      num_eliminate_blocks = 0;
+      ordering_type = NATURAL;
+
 #if defined(CERES_NO_SUITESPARSE)
       use_block_amd = false;
 #else
       use_block_amd = true;
 #endif
 
-      preconditioner_type = JACOBI;
-      num_linear_solver_threads = 1;
-      num_eliminate_blocks = 0;
-      ordering_type = NATURAL;
       linear_solver_min_num_iterations = 1;
       linear_solver_max_num_iterations = 500;
       eta = 1e-1;
@@ -419,6 +421,9 @@
 
     PreconditionerType preconditioner_type;
     OrderingType ordering_type;
+
+    TrustRegionStrategyType trust_region_strategy_type;
+    SparseLinearAlgebraLibraryType sparse_linear_algebra_library;
   };
 
   // Once a least squares problem has been built, this function takes
diff --git a/internal/ceres/solver.cc b/internal/ceres/solver.cc
index 901e543..c61383c 100644
--- a/internal/ceres/solver.cc
+++ b/internal/ceres/solver.cc
@@ -89,7 +89,9 @@
       linear_solver_type_given(SPARSE_NORMAL_CHOLESKY),
       linear_solver_type_used(SPARSE_NORMAL_CHOLESKY),
       preconditioner_type(IDENTITY),
-      ordering_type(NATURAL) {
+      ordering_type(NATURAL),
+      trust_region_strategy_type(LEVENBERG_MARQUARDT),
+      sparse_linear_algebra_library(SUITE_SPARSE) {
 }
 
 string Solver::Summary::BriefReport() const {
@@ -179,10 +181,17 @@
 
   internal::StringAppendF(&report, "Threads:            % 25d% 25d\n",
                           num_threads_given, num_threads_used);
-  internal::StringAppendF(&report, "Linear solver threads:% 23d% 25d\n",
+  internal::StringAppendF(&report, "Linear solver threads % 23d% 25d\n",
                           num_linear_solver_threads_given,
                           num_linear_solver_threads_used);
 
+  internal::StringAppendF(&report, "\nSparse Linear Algebra Library %15s\n",
+                          SparseLinearAlgebraLibraryTypeToString(
+                              sparse_linear_algebra_library));
+
+  internal::StringAppendF(&report, "Trust Region Strategy     %19s\n",
+                          TrustRegionStrategyTypeToString(
+                              trust_region_strategy_type));
 
   if (termination_type == DID_NOT_RUN) {
     CHECK(!error.empty())
diff --git a/internal/ceres/solver_impl.cc b/internal/ceres/solver_impl.cc
index 41b6d75..877a5af 100644
--- a/internal/ceres/solver_impl.cc
+++ b/internal/ceres/solver_impl.cc
@@ -203,6 +203,9 @@
   summary->num_residuals = problem_impl->NumResiduals();
 
   summary->num_threads_used = options.num_threads;
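+  // Record the solver configuration in the summary so that
+  // Summary::FullReport can report it.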
+  summary->sparse_linear_algebra_library =
+      options.sparse_linear_algebra_library;
+  summary->trust_region_strategy_type = options.trust_region_strategy_type;
 
   // Evaluate the initial cost and residual vector (if needed). The
   // initial cost needs to be computed on the original unpreprocessed