.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-solving:

==========
Solver API
==========

Introduction
============

Effective use of Ceres requires some familiarity with the basic
components of a nonlinear least squares solver, so before we describe
how to configure the solver, we will begin by taking a brief look at
how some of the core optimization algorithms in Ceres work and the
various linear solvers and preconditioners that power it.

Sameer Agarwal3d87b722013-02-02 00:49:31 -080022
23Let :math:`x \in \mathbb{R}^n` be an :math:`n`-dimensional vector of
24variables, and
25:math:`F(x) = \left[f_1(x), ... , f_{m}(x) \right]^{\top}` be a
26:math:`m`-dimensional function of :math:`x`. We are interested in
27solving the following optimization problem [#f1]_ .
28
29.. math:: \arg \min_x \frac{1}{2}\|F(x)\|^2\ .
30 :label: nonlinsq
31
32Here, the Jacobian :math:`J(x)` of :math:`F(x)` is an :math:`m\times
33n` matrix, where :math:`J_{ij}(x) = \partial_j f_i(x)` and the
34gradient vector :math:`g(x) = \nabla \frac{1}{2}\|F(x)\|^2 = J(x)^\top
Sameer Agarwalfbbea462013-02-15 11:25:03 -080035F(x)`. Since the efficient global minimization of :eq:`nonlinsq` for
36general :math:`F(x)` is an intractable problem, we will have to settle
37for finding a local minimum.
Sameer Agarwal3d87b722013-02-02 00:49:31 -080038
The general strategy when solving non-linear optimization problems is
to solve a sequence of approximations to the original problem
[NocedalWright]_. At each iteration, the approximation is solved to
determine a correction :math:`\Delta x` to the vector :math:`x`. For
non-linear least squares, an approximation can be constructed by using
the linearization :math:`F(x+\Delta x) \approx F(x) + J(x)\Delta x`,
which leads to the following linear least squares problem:

.. math:: \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2
   :label: linearapprox

Unfortunately, naively solving a sequence of these problems and
updating :math:`x \leftarrow x+ \Delta x` leads to an algorithm that
may not converge. To get a convergent algorithm, we need to control
the size of the step :math:`\Delta x`. Depending on how the size of
the step :math:`\Delta x` is controlled, non-linear optimization
algorithms can be divided into two major categories [NocedalWright]_.

1. **Trust Region** The trust region approach approximates the
   objective function using a model function (often a quadratic)
   over a subset of the search space known as the trust region. If the
   model function succeeds in minimizing the true objective function,
   the trust region is expanded; otherwise it is contracted and the
   model optimization problem is solved again.

2. **Line Search** The line search approach first finds a descent
   direction along which the objective function will be reduced and
   then computes a step size that decides how far to move along
   that direction. The descent direction can be computed by various
   methods, such as gradient descent, Newton's method and Quasi-Newton
   methods. The step size can be determined either exactly or
   inexactly.

Trust region methods are in some sense dual to line search methods:
trust region methods first choose a step size (the size of the trust
region) and then a step direction, while line search methods first
choose a step direction and then a step size.

Ceres implements multiple algorithms in both categories.

.. _section-trust-region-methods:

Trust Region Methods
====================

The basic trust region algorithm looks something like this.

   1. Given an initial point :math:`x` and a trust region radius :math:`\mu`.
   2. :math:`\arg \min_{\Delta x} \frac{1}{2}\|J(x)\Delta
      x + F(x)\|^2` s.t. :math:`\|D(x)\Delta x\|^2 \le \mu`
   3. :math:`\rho = \frac{\displaystyle \|F(x + \Delta x)\|^2 -
      \|F(x)\|^2}{\displaystyle \|J(x)\Delta x + F(x)\|^2 -
      \|F(x)\|^2}`
   4. if :math:`\rho > \epsilon` then :math:`x = x + \Delta x`.
   5. if :math:`\rho > \eta_1` then :math:`\mu = 2 \mu`
   6. else if :math:`\rho < \eta_2` then :math:`\mu = 0.5 * \mu`
   7. Goto 2.

Here, :math:`\mu` is the trust region radius, :math:`D(x)` is some
matrix used to define a metric on the domain of :math:`F(x)` and
:math:`\rho` measures the quality of the step :math:`\Delta x`, i.e.,
how well did the linear model predict the decrease in the value of the
non-linear objective. The idea is to increase or decrease the radius
of the trust region depending on how well the linearization predicts
the behavior of the non-linear objective, which in turn is reflected
in the value of :math:`\rho`.

The key computational step in a trust-region algorithm is the solution
of the constrained optimization problem

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2\quad \text{such that}\quad \|D(x)\Delta x\|^2 \le \mu
   :label: trp

There are a number of different ways of solving this problem, each
giving rise to a different concrete trust-region algorithm. Currently,
Ceres implements two trust-region algorithms - Levenberg-Marquardt
and Dogleg. The user can choose between them by setting
:member:`Solver::Options::trust_region_strategy_type`.

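For example, a minimal sketch of how this choice might be made in
code; the ``problem`` object is assumed to have been constructed
elsewhere:

.. code-block:: c++

   ceres::Solver::Options options;
   // Use Levenberg-Marquardt (the default) ...
   options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;
   // ... or switch to the Dogleg strategy instead.
   // options.trust_region_strategy_type = ceres::DOGLEG;

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);  // problem built elsewhere.
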
.. rubric:: Footnotes

.. [#f1] At the level of the non-linear solver, the block structure is
   not relevant, therefore our discussion here is in terms of an
   optimization problem defined over a state vector of size :math:`n`.

.. _section-levenberg-marquardt:

Levenberg-Marquardt
-------------------

The Levenberg-Marquardt algorithm [Levenberg]_ [Marquardt]_ is the
most popular algorithm for solving non-linear least squares problems.
It was also the first trust region algorithm to be developed
[Levenberg]_ [Marquardt]_. Ceres implements an exact step [Madsen]_
and an inexact step variant of the Levenberg-Marquardt algorithm
[WrightHolt]_ [NashSofer]_.

It can be shown that the solution to :eq:`trp` can be obtained by
solving an unconstrained optimization problem of the form

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \lambda \|D(x)\Delta x\|^2

where :math:`\lambda` is a Lagrange multiplier that is inversely
related to :math:`\mu`. In Ceres, we solve for

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \frac{1}{\mu} \|D(x)\Delta x\|^2
   :label: lsqr

The matrix :math:`D(x)` is a non-negative diagonal matrix, typically
the square root of the diagonal of the matrix :math:`J(x)^\top J(x)`.

Before going further, let us make some notational simplifications. We
will assume that the matrix :math:`\sqrt{\mu} D` has been concatenated
at the bottom of the matrix :math:`J` and similarly a vector of zeros
has been added to the bottom of the vector :math:`f` and the rest of
our discussion will be in terms of :math:`J` and :math:`f`, i.e., the
linear least squares problem.

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
   :label: simple

For all but the smallest problems the solution of :eq:`simple` in
each iteration of the Levenberg-Marquardt algorithm is the dominant
computational cost in Ceres. Ceres provides a number of different
options for solving :eq:`simple`. There are two major classes of
methods - factorization and iterative.

The factorization methods are based on computing an exact solution of
:eq:`lsqr` using a Cholesky or a QR factorization and lead to an exact
step Levenberg-Marquardt algorithm. But it is not clear if an exact
solution of :eq:`lsqr` is necessary at each step of the LM algorithm
to solve :eq:`nonlinsq`. In fact, we have already seen evidence
that this may not be the case, as :eq:`lsqr` is itself a regularized
version of :eq:`linearapprox`. Indeed, it is possible to
construct non-linear optimization algorithms in which the linearized
problem is solved approximately. These algorithms are known as inexact
Newton or truncated Newton methods [NocedalWright]_.

An inexact Newton method requires two ingredients. First, a cheap
method for approximately solving systems of linear
equations. Typically an iterative linear solver like the Conjugate
Gradients method is used for this
purpose [NocedalWright]_. Second, a termination rule for
the iterative solver. A typical termination rule is of the form

.. math:: \|H(x) \Delta x + g(x)\| \leq \eta_k \|g(x)\|.
   :label: inexact

Here, :math:`k` indicates the Levenberg-Marquardt iteration number and
:math:`0 < \eta_k <1` is known as the forcing sequence. [WrightHolt]_
prove that a truncated Levenberg-Marquardt algorithm that uses an
inexact Newton step based on :eq:`inexact` converges for any
sequence :math:`\eta_k \leq \eta_0 < 1` and the rate of convergence
depends on the choice of the forcing sequence :math:`\eta_k`.

Ceres supports both exact and inexact step solution strategies. When
the user chooses a factorization based linear solver, the exact step
Levenberg-Marquardt algorithm is used. When the user chooses an
iterative linear solver, the inexact step Levenberg-Marquardt
algorithm is used.

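As a sketch, the choice between the two variants is therefore made
implicitly through the linear solver; the option names used below are
the ones documented later in this chapter:

.. code-block:: c++

   ceres::Solver::Options options;
   // A factorization based solver gives the exact step variant ...
   options.linear_solver_type = ceres::DENSE_QR;
   // ... while an iterative solver such as ITERATIVE_SCHUR or CGNR
   // gives the inexact (truncated) step variant.
   // options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   // options.eta = 1e-2;  // Controls the forcing sequence.
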
.. _section-dogleg:

Dogleg
------

Another strategy for solving the trust region problem :eq:`trp` was
introduced by M. J. D. Powell. The key idea there is to compute two
vectors

.. math::

   \Delta x^{\text{Gauss-Newton}} &= \arg \min_{\Delta x}\frac{1}{2} \|J(x)\Delta x + f(x)\|^2.\\
   \Delta x^{\text{Cauchy}} &= -\frac{\|g(x)\|^2}{\|J(x)g(x)\|^2}g(x).

Note that the vector :math:`\Delta x^{\text{Gauss-Newton}}` is the
solution to :eq:`linearapprox` and :math:`\Delta
x^{\text{Cauchy}}` is the vector that minimizes the linear
approximation if we restrict ourselves to moving along the direction
of the gradient. The Dogleg method finds a vector :math:`\Delta x`
defined by :math:`\Delta x^{\text{Gauss-Newton}}` and :math:`\Delta
x^{\text{Cauchy}}` that solves the trust region problem. Ceres
supports two variants that can be chosen by setting
:member:`Solver::Options::dogleg_type`.

``TRADITIONAL_DOGLEG``, as described by Powell, constructs two line
segments using the Gauss-Newton and Cauchy vectors and finds the point
farthest along this line shaped like a dogleg (hence the name) that is
contained in the trust region. For more details on the exact reasoning
and computations, please see Madsen et al [Madsen]_.

``SUBSPACE_DOGLEG`` is a more sophisticated method that considers the
entire two dimensional subspace spanned by these two vectors and finds
the point that minimizes the trust region problem in this subspace
[ByrdSchnabel]_.

The key advantage of Dogleg over Levenberg-Marquardt is that if the
step computation for a particular choice of :math:`\mu` does not
result in sufficient decrease in the value of the objective function,
Levenberg-Marquardt solves the linear approximation from scratch with
a smaller value of :math:`\mu`. Dogleg, on the other hand, only needs
to compute the interpolation between the Gauss-Newton and the Cauchy
vectors, as neither of them depend on the value of :math:`\mu`.

The Dogleg method can only be used with the exact factorization based
linear solvers.

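A minimal sketch of a Dogleg configuration, using one of the
factorization based linear solvers as required above:

.. code-block:: c++

   ceres::Solver::Options options;
   options.trust_region_strategy_type = ceres::DOGLEG;
   options.dogleg_type = ceres::SUBSPACE_DOGLEG;  // or ceres::TRADITIONAL_DOGLEG
   options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;  // factorization based
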
.. _section-inner-iterations:

Inner Iterations
----------------

Some non-linear least squares problems have additional structure in
the way the parameter blocks interact, and it can be beneficial to
modify the way the trust region step is computed. For example,
consider the following regression problem

.. math:: y = a_1 e^{b_1 x} + a_2 e^{b_2 x^2 + c_1}


Given a set of pairs :math:`\{(x_i, y_i)\}`, the user wishes to estimate
:math:`a_1, a_2, b_1, b_2`, and :math:`c_1`.

Notice that the expression on the right is linear in :math:`a_1` and
:math:`a_2`, and given any value for :math:`b_1, b_2` and :math:`c_1`,
it is possible to use linear regression to estimate the optimal values
of :math:`a_1` and :math:`a_2`. Indeed, it is possible to analytically
eliminate the variables :math:`a_1` and :math:`a_2` from the problem
entirely. Problems like these are known as separable least squares
problems and the most famous algorithm for solving them is the Variable
Projection algorithm invented by Golub & Pereyra [GolubPereyra]_.

Similar structure can be found in the matrix factorization with
missing data problem. There the corresponding algorithm is known as
Wiberg's algorithm [Wiberg]_.

Ruhe & Wedin present an analysis of various algorithms for solving
separable non-linear least squares problems and refer to *Variable
Projection* as Algorithm I in their paper [RuheWedin]_.

Implementing Variable Projection is tedious and expensive. Ruhe &
Wedin present a simpler algorithm with comparable convergence
properties, which they call Algorithm II. Algorithm II performs an
additional optimization step to estimate :math:`a_1` and :math:`a_2`
exactly after computing a successful Newton step.

This idea can be generalized to cases where the residual is not
linear in :math:`a_1` and :math:`a_2`, i.e.,

.. math:: y = f_1(a_1, e^{b_1 x}) + f_2(a_2, e^{b_2 x^2 + c_1})

In this case, we solve for the trust region step for the full problem,
and then use it as the starting point to further optimize just
:math:`a_1` and :math:`a_2`. For the linear case, this amounts to
doing a single linear least squares solve. For non-linear problems,
any method for solving the :math:`a_1` and :math:`a_2` optimization
problems will do. The only constraint on :math:`a_1` and :math:`a_2`
(if they are two different parameter blocks) is that they do not
co-occur in a residual block.

This idea can be further generalized, by not just optimizing
:math:`(a_1, a_2)`, but decomposing the graph corresponding to the
Hessian matrix's sparsity structure into a collection of
non-overlapping independent sets and optimizing each of them.

Setting :member:`Solver::Options::use_inner_iterations` to ``true``
enables the use of this non-linear generalization of Ruhe & Wedin's
Algorithm II. This version of Ceres has a higher iteration
complexity, but also displays better convergence behavior per
iteration.

Setting :member:`Solver::Options::num_threads` to the maximum number
possible is highly recommended.

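A minimal sketch of enabling inner iterations; leaving
:member:`Solver::Options::inner_iteration_ordering` at its default
lets the solver pick the parameter blocks heuristically:

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_inner_iterations = true;
   // inner_iteration_ordering stays NULL, so Ceres chooses the blocks.
   options.num_threads = 4;  // Use as many threads as the machine allows.
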
.. _section-non-monotonic-steps:

Non-monotonic Steps
-------------------

Note that the basic trust-region algorithm described in
:ref:`section-trust-region-methods` is a descent algorithm in that it
only accepts a point if it strictly reduces the value of the objective
function.

Relaxing this requirement allows the algorithm to be more efficient in
the long term at the cost of some local increase in the value of the
objective function.

This is because allowing for non-decreasing objective function values
in a principled manner allows the algorithm to *jump over boulders* as
the method is not restricted to move into narrow valleys while
preserving its convergence properties.

Setting :member:`Solver::Options::use_nonmonotonic_steps` to ``true``
enables the non-monotonic trust region algorithm as described by Conn,
Gould & Toint in [Conn]_.

Even though the value of the objective function may be larger
than the minimum value encountered over the course of the
optimization, the final parameters returned to the user are the
ones corresponding to the minimum cost over all iterations.

The option to take non-monotonic steps is available for all trust
region strategies.

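A minimal sketch of enabling this behavior:

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_nonmonotonic_steps = true;
   // Window over which the objective is allowed to increase.
   options.max_consecutive_nonmonotonic_steps = 5;
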
.. _section-line-search-methods:

Line Search Methods
===================

**The implementation of line search algorithms in Ceres Solver is
fairly new and not very well tested, so for now this part of the
solver should be considered beta quality. We welcome reports of your
experiences both good and bad on the mailing list.**

The basic line search algorithm looks something like this.

   1. Given an initial point :math:`x`
   2. :math:`\Delta x = -H^{-1}(x) g(x)`
   3. :math:`\arg \min_\mu \frac{1}{2} \| F(x + \mu \Delta x) \|^2`
   4. :math:`x = x + \mu \Delta x`
   5. Goto 2.

Here :math:`H(x)` is some approximation to the Hessian of the
objective function, and :math:`g(x)` is the gradient at
:math:`x`. Depending on the choice of :math:`H(x)` we get a variety of
different search directions :math:`\Delta x`.

Step 3, which is a one dimensional optimization or *line search* along
:math:`\Delta x`, is what gives this class of methods its name.

Different line search algorithms differ in their choice of the search
direction :math:`\Delta x` and the method used for one dimensional
optimization along :math:`\Delta x`. The choice of :math:`H(x)` is the
primary source of computational complexity in these
methods. Currently, Ceres Solver supports three choices of search
directions, all aimed at large scale problems.

1. ``STEEPEST_DESCENT`` This corresponds to choosing :math:`H(x)` to
   be the identity matrix. This is not a good search direction for
   anything but the simplest of problems. It is only included here
   for completeness.

2. ``NONLINEAR_CONJUGATE_GRADIENT`` A generalization of the Conjugate
   Gradient method to non-linear functions. The generalization can be
   performed in a number of different ways, resulting in a variety of
   search directions. Ceres Solver currently supports
   ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and ``HESTENES_STIEFEL``
   directions.

3. ``LBFGS`` In this method, a limited memory approximation to the
   inverse Hessian is maintained and used to compute a quasi-Newton
   step [Nocedal]_, [ByrdNocedal]_.

Currently Ceres Solver uses a backtracking and interpolation based
Armijo line search algorithm.

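A minimal sketch of a line search configuration using these options:

.. code-block:: c++

   ceres::Solver::Options options;
   options.minimizer_type = ceres::LINE_SEARCH;
   options.line_search_direction_type = ceres::LBFGS;
   // For nonlinear conjugate gradients instead:
   // options.line_search_direction_type = ceres::NONLINEAR_CONJUGATE_GRADIENT;
   // options.nonlinear_conjugate_gradient_type = ceres::FLETCHER_REEVES;
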
.. _section-linear-solver:

LinearSolver
============

Recall that in both of the trust-region methods described above, the
key computational cost is the solution of a linear least squares
problem of the form

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
   :label: simple2

Let :math:`H(x)= J(x)^\top J(x)` and :math:`g(x) = -J(x)^\top
f(x)`. For notational convenience let us also drop the dependence on
:math:`x`. Then it is easy to see that solving :eq:`simple2` is
equivalent to solving the *normal equations*.

.. math:: H \Delta x = g
   :label: normal

Ceres provides a number of different options for solving :eq:`normal`.

.. _section-qr:

``DENSE_QR``
------------

For small problems (a couple of hundred parameters and a few thousand
residuals) with relatively dense Jacobians, ``DENSE_QR`` is the method
of choice [Bjorck]_. Let :math:`J = QR` be the QR-decomposition of
:math:`J`, where :math:`Q` is an orthonormal matrix and :math:`R` is
an upper triangular matrix [TrefethenBau]_. Then it can be shown that
the solution to :eq:`normal` is given by

.. math:: \Delta x^* = -R^{-1}Q^\top f

Ceres uses ``Eigen`` 's dense QR factorization routines.

.. _section-cholesky:

``DENSE_NORMAL_CHOLESKY`` & ``SPARSE_NORMAL_CHOLESKY``
------------------------------------------------------

Large non-linear least squares problems are usually sparse. In such
cases, using a dense QR factorization is inefficient. Let :math:`H =
R^\top R` be the Cholesky factorization of the normal equations, where
:math:`R` is an upper triangular matrix, then the solution to
:eq:`normal` is given by

.. math::

   \Delta x^* = R^{-1} R^{-\top} g.


The observant reader will note that the :math:`R` in the Cholesky
factorization of :math:`H` is the same upper triangular matrix
:math:`R` in the QR factorization of :math:`J`. Since :math:`Q` is an
orthonormal matrix, :math:`J=QR` implies that :math:`J^\top J = R^\top
Q^\top Q R = R^\top R`. There are two variants of Cholesky
factorization -- sparse and dense.

``DENSE_NORMAL_CHOLESKY`` as the name implies performs a dense
Cholesky factorization of the normal equations. Ceres uses
``Eigen`` 's dense LDLT factorization routines.

``SPARSE_NORMAL_CHOLESKY``, as the name implies performs a sparse
Cholesky factorization of the normal equations. This leads to
substantial savings in time and memory for large sparse
problems. Ceres uses the sparse Cholesky factorization routines in
Professor Tim Davis' ``SuiteSparse`` or ``CXSparse`` packages [Chen]_.

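A minimal sketch of selecting the sparse Cholesky path, assuming Ceres
was built with one of these libraries:

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
   options.sparse_linear_algebra_library = ceres::SUITE_SPARSE;  // or ceres::CX_SPARSE
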
.. _section-schur:

``DENSE_SCHUR`` & ``SPARSE_SCHUR``
----------------------------------

While it is possible to use ``SPARSE_NORMAL_CHOLESKY`` to solve bundle
adjustment problems, bundle adjustment problems have a special
structure, and a more efficient scheme for solving :eq:`normal`
can be constructed.

Suppose that the SfM problem consists of :math:`p` cameras and
:math:`q` points and the variable vector :math:`x` has the block
structure :math:`x = [y_{1}, ... ,y_{p},z_{1}, ... ,z_{q}]`, where
:math:`y` and :math:`z` correspond to camera and point parameters,
respectively. Further, let the camera blocks be of size :math:`c` and
the point blocks be of size :math:`s` (for most problems :math:`c` =
:math:`6`--:math:`9` and :math:`s = 3`). Ceres does not impose any
constancy requirement on these block sizes, but choosing them to be
constant simplifies the exposition.

A key characteristic of the bundle adjustment problem is that there is
no term :math:`f_{i}` that includes two or more point blocks. This in
turn implies that the matrix :math:`H` is of the form

.. math:: H = \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix} \right]\ ,
   :label: hblock

where, :math:`B \in \mathbb{R}^{pc\times pc}` is a block sparse matrix
with :math:`p` blocks of size :math:`c\times c` and :math:`C \in
\mathbb{R}^{qs\times qs}` is a block diagonal matrix with :math:`q` blocks
of size :math:`s\times s`. :math:`E \in \mathbb{R}^{pc\times qs}` is a
general block sparse matrix, with a block of size :math:`c\times s`
for each observation. Let us now block partition :math:`\Delta x =
[\Delta y,\Delta z]` and :math:`g=[v,w]` to restate :eq:`normal`
as the block structured linear system

.. math:: \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix}
          \right]\left[ \begin{matrix} \Delta y \\ \Delta z
          \end{matrix} \right] = \left[ \begin{matrix} v\\ w
          \end{matrix} \right]\ ,
   :label: linear2

and apply Gaussian elimination to it. As we noted above, :math:`C` is
a block diagonal matrix, with small diagonal blocks of size
:math:`s\times s`. Thus, calculating the inverse of :math:`C` by
inverting each of these blocks is cheap. This allows us to eliminate
:math:`\Delta z` by observing that :math:`\Delta z = C^{-1}(w - E^\top
\Delta y)`, giving us

.. math:: \left[B - EC^{-1}E^\top\right] \Delta y = v - EC^{-1}w\ .
   :label: schur

The matrix

.. math:: S = B - EC^{-1}E^\top

is the Schur complement of :math:`C` in :math:`H`. It is also known as
the *reduced camera matrix*, because the only variables
participating in :eq:`schur` are the ones corresponding to the
cameras. :math:`S \in \mathbb{R}^{pc\times pc}` is a block structured
symmetric positive definite matrix, with blocks of size :math:`c\times
c`. The block :math:`S_{ij}` corresponding to the pair of images
:math:`i` and :math:`j` is non-zero if and only if the two images
observe at least one common point.

Now, :eq:`linear2` can be solved by first forming :math:`S`, solving for
:math:`\Delta y`, and then back-substituting :math:`\Delta y` to
obtain the value of :math:`\Delta z`. Thus, the solution of what was
an :math:`n\times n`, :math:`n=pc+qs` linear system is reduced to the
inversion of the block diagonal matrix :math:`C`, a few matrix-matrix
and matrix-vector multiplies, and the solution of the block sparse
:math:`pc\times pc` linear system :eq:`schur`. For almost all
problems, the number of cameras is much smaller than the number of
points, :math:`p \ll q`, thus solving :eq:`schur` is
significantly cheaper than solving :eq:`linear2`. This is the
*Schur complement trick* [Brown]_.

This still leaves open the question of solving :eq:`schur`. The
method of choice for solving symmetric positive definite systems
exactly is via the Cholesky factorization [TrefethenBau]_ and
depending upon the structure of the matrix, there are, in general, two
options. The first is direct factorization, where we store and factor
:math:`S` as a dense matrix [TrefethenBau]_. This method has
:math:`O(p^2)` space complexity and :math:`O(p^3)` time complexity and
is only practical for problems with up to a few hundred cameras. Ceres
implements this strategy as the ``DENSE_SCHUR`` solver.

But, :math:`S` is typically a fairly sparse matrix, as most images
only see a small fraction of the scene. This leads us to the second
option: Sparse Direct Methods. These methods store :math:`S` as a
sparse matrix, use row and column re-ordering algorithms to maximize
the sparsity of the Cholesky decomposition, and focus their compute
effort on the non-zero part of the factorization [Chen]_. Sparse
direct methods, depending on the exact sparsity structure of the Schur
complement, allow bundle adjustment algorithms to significantly scale
up over those based on dense factorization. Ceres implements this
strategy as the ``SPARSE_SCHUR`` solver.

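A minimal sketch of selecting a Schur based solver for a bundle
adjustment problem; the elimination ordering can be left to Ceres (see
:ref:`section-ordering`):

.. code-block:: c++

   ceres::Solver::Options options;
   // DENSE_SCHUR for up to a few hundred cameras,
   // SPARSE_SCHUR for larger, sparser problems.
   options.linear_solver_type = ceres::SPARSE_SCHUR;
   // options.linear_solver_type = ceres::DENSE_SCHUR;
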
.. _section-cgnr:

``CGNR``
--------

For general sparse problems, if the problem is too large for
``CHOLMOD`` or a sparse linear algebra library is not linked into
Ceres, another option is the ``CGNR`` solver. This solver uses the
Conjugate Gradients solver on the *normal equations*, but without
forming the normal equations explicitly. It exploits the relation

.. math::

   H x = J^\top J x = J^\top(J x)

When the user chooses ``CGNR`` as the linear solver, Ceres
automatically switches from the exact step algorithm to an inexact
step algorithm.

.. _section-iterative_schur:

``ITERATIVE_SCHUR``
-------------------

Another option for bundle adjustment problems is to apply PCG to the
reduced camera matrix :math:`S` instead of :math:`H`. One reason to do
this is that :math:`S` is a much smaller matrix than :math:`H`, but
more importantly, it can be shown that :math:`\kappa(S)\leq
\kappa(H)`. Ceres implements PCG on :math:`S` as the
``ITERATIVE_SCHUR`` solver. When the user chooses ``ITERATIVE_SCHUR``
as the linear solver, Ceres automatically switches from the exact step
algorithm to an inexact step algorithm.

The cost of forming and storing the Schur complement :math:`S` can be
prohibitive for large problems. Indeed, for an inexact Newton solver
that computes :math:`S` and runs PCG on it, almost all of its time is
spent in constructing :math:`S`; the time spent inside the PCG
algorithm is negligible in comparison. Because PCG only needs access
to :math:`S` via its product with a vector, one way to evaluate
:math:`Sx` is to observe that

.. math::
   :label: schurtrick1

   x_1 &= E^\top x\\
   x_2 &= C^{-1} x_1\\
   x_3 &= Ex_2\\
   x_4 &= Bx\\
   Sx &= x_4 - x_3

Thus, we can run PCG on :math:`S` with the same computational effort
per iteration as PCG on :math:`H`, while reaping the benefits of a
more powerful preconditioner. In fact, we do not even need to compute
:math:`H`; :eq:`schurtrick1` can be implemented using just the columns
of :math:`J`.

Equation :eq:`schurtrick1` is closely related to *Domain
Decomposition methods* for solving large linear systems that arise in
structural engineering and partial differential equations. In the
language of Domain Decomposition, each point in a bundle adjustment
problem is a domain, and the cameras form the interface between these
domains. The iterative solution of the Schur complement then falls
within the sub-category of techniques known as Iterative
Sub-structuring [Saad]_ [Mathew]_.

.. _section-preconditioner:

Preconditioner
--------------

The convergence rate of Conjugate Gradients for
solving :eq:`normal` depends on the distribution of eigenvalues
of :math:`H` [Saad]_. A useful upper bound is
:math:`\sqrt{\kappa(H)}`, where, :math:`\kappa(H)` is the condition
number of the matrix :math:`H`. For most bundle adjustment problems,
:math:`\kappa(H)` is high and a direct application of Conjugate
Gradients to :eq:`normal` results in extremely poor performance.

The solution to this problem is to replace :eq:`normal` with a
*preconditioned* system. Given a linear system, :math:`Ax =b` and a
preconditioner :math:`M` the preconditioned system is given by
:math:`M^{-1}Ax = M^{-1}b`. The resulting algorithm is known as the
Preconditioned Conjugate Gradients algorithm (PCG) and its worst case
complexity now depends on the condition number of the *preconditioned*
matrix :math:`\kappa(M^{-1}A)`.

The computational cost of using a preconditioner :math:`M` is the cost
of computing :math:`M` and evaluating the product :math:`M^{-1}y` for
arbitrary vectors :math:`y`. Thus, there are two competing factors to
consider: How much of :math:`H`'s structure is captured by :math:`M`
so that the condition number :math:`\kappa(HM^{-1})` is low, and the
computational cost of constructing and using :math:`M`. The ideal
preconditioner would be one for which :math:`\kappa(M^{-1}A)
=1`. :math:`M=A` achieves this, but it is not a practical choice, as
applying this preconditioner would require solving a linear system
equivalent to the unpreconditioned problem. It is usually the case
that the more information :math:`M` has about :math:`H`, the more
expensive it is to use. For example, Incomplete Cholesky factorization
based preconditioners have much better convergence behavior than the
Jacobi preconditioner, but are also much more expensive.

The simplest of all preconditioners is the diagonal or Jacobi
preconditioner, i.e., :math:`M=\operatorname{diag}(A)`, which for
block structured matrices like :math:`H` can be generalized to the
block Jacobi preconditioner.

For ``ITERATIVE_SCHUR`` there are two obvious choices for block
diagonal preconditioners for :math:`S`. The block diagonal of the
matrix :math:`B` [Mandel]_ and the block diagonal of :math:`S`, i.e.,
the block Jacobi preconditioner for :math:`S`. Ceres implements both
of these preconditioners and refers to them as ``JACOBI`` and
``SCHUR_JACOBI`` respectively.

For bundle adjustment problems arising in reconstruction from
community photo collections, more effective preconditioners can be
constructed by analyzing and exploiting the camera-point visibility
structure of the scene [KushalAgarwal]_. Ceres implements the two
visibility based preconditioners described by Kushal & Agarwal as
``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. These are fairly new
preconditioners and Ceres' implementation of them is in its early
stages and is not as mature as the other preconditioners described
above.

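A minimal sketch of pairing ``ITERATIVE_SCHUR`` with one of these
preconditioners:

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   options.preconditioner_type = ceres::SCHUR_JACOBI;
   // Alternatives: ceres::JACOBI, ceres::CLUSTER_JACOBI,
   // ceres::CLUSTER_TRIDIAGONAL.
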
.. _section-ordering:

Ordering
--------

The order in which variables are eliminated in a linear solver can
have a significant impact on the efficiency and accuracy of the
method. For example when doing sparse Cholesky factorization, there
are matrices for which a good ordering will give a Cholesky factor
with :math:`O(n)` storage, whereas a bad ordering will result in a
completely dense factor.

Ceres allows the user to provide varying amounts of hints to the
solver about the variable elimination ordering to use. This can range
from no hints, where the solver is free to decide the best ordering
based on the user's choices like the linear solver being used, to an
exact order in which the variables should be eliminated, and a variety
of possibilities in between.

Instances of the :class:`ParameterBlockOrdering` class are used to
communicate this information to Ceres.

Formally an ordering is an ordered partitioning of the parameter
blocks. Each parameter block belongs to exactly one group, and each
group has a unique integer associated with it, that determines its
order in the set of groups. We call these groups *Elimination Groups*.

Given such an ordering, Ceres ensures that the parameter blocks in the
lowest numbered elimination group are eliminated first, and then the
parameter blocks in the next lowest numbered elimination group and so
on. Within each elimination group, Ceres is free to order the
parameter blocks as it chooses. For example, consider the linear system

.. math::

   x + y &= 3\\
   2x + 3y &= 7

There are two ways in which it can be solved. First eliminating
:math:`x` from the two equations, solving for :math:`y` and then back
substituting for :math:`x`, or first eliminating :math:`y`, solving
for :math:`x` and back substituting for :math:`y`. The user can
construct three orderings here.

1. :math:`\{0: x\}, \{1: y\}` : Eliminate :math:`x` first.
2. :math:`\{0: y\}, \{1: x\}` : Eliminate :math:`y` first.
3. :math:`\{0: x, y\}` : Solver gets to decide the elimination order.

Thus, to have Ceres determine the ordering automatically using
heuristics, put all the variables in the same elimination group. The
identity of the group does not matter. This is the same as not
specifying an ordering at all. To control the ordering for every
variable, create an elimination group per variable, ordering them in
the desired order.

If the user is using one of the Schur solvers (``DENSE_SCHUR``,
``SPARSE_SCHUR``, ``ITERATIVE_SCHUR``) and chooses to specify an
ordering, it must have one important property. The lowest numbered
elimination group must form an independent set in the graph
corresponding to the Hessian, or in other words, no two parameter
blocks in the first elimination group should co-occur in the same
residual block. For the best performance, this elimination group
should be as large as possible. For standard bundle adjustment
problems, this corresponds to the first elimination group containing
all the 3d points, and the second containing all the camera parameter
blocks.

If the user leaves the choice to Ceres, then the solver uses an
approximate maximum independent set algorithm to identify the first
elimination group [LiSaad]_.

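A hedged sketch of specifying such an ordering for a bundle adjustment
problem; ``points``, ``cameras``, ``num_points`` and ``num_cameras``
are assumed to describe the user's parameter blocks, and ``options``
is the :class:`Solver::Options` instance being configured:

.. code-block:: c++

   ceres::ParameterBlockOrdering* ordering = new ceres::ParameterBlockOrdering;
   // points[i] and cameras[i] are assumed to point to the parameter blocks.
   // Eliminate all the points (group 0) before the cameras (group 1).
   for (int i = 0; i < num_points; ++i) {
     ordering->AddElementToGroup(points[i], 0);
   }
   for (int i = 0; i < num_cameras; ++i) {
     ordering->AddElementToGroup(cameras[i], 1);
   }
   options.linear_solver_ordering = ordering;
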
.. _section-solver-options:

:class:`Solver::Options`
------------------------

.. class:: Solver::Options

   :class:`Solver::Options` controls the overall behavior of the
   solver. We list the various settings and their default values below.

Sameer Agarwalfbbea462013-02-15 11:25:03 -0800773
774.. member:: MinimizerType Solver::Options::minimizer_type
775
776 Default: ``TRUST_REGION``
777
778 Choose between ``LINE_SEARCH`` and ``TRUST_REGION`` algorithms. See
779 :ref:`section-trust-region-methods` and
780 :ref:`section-line-search-methods` for more details.
781
782.. member:: LineSearchDirectionType Solver::Options::line_search_direction_type
783
784 Default: ``LBFGS``
785
786 Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``
787 and ``LBFGS``.
788
789.. member:: LineSearchType Solver::Options::line_search_type
790
791 Default: ``ARMIJO``
792
793 ``ARMIJO`` is the only choice right now.
794
Sameer Agarwalfa21df82013-02-18 08:48:52 -0800795.. member:: NonlinearConjugateGradientType Solver::Options::nonlinear_conjugate_gradient_type
Sameer Agarwalfbbea462013-02-15 11:25:03 -0800796
797 Default: ``FLETCHER_REEVES``
798
799 Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIRERE`` and
800 ``HESTENES_STIEFEL``.
801
802.. member:: int Solver::Options::max_lbfs_rank
803
804 Default: 20
805
806 The LBFGS hessian approximation is a low rank approximation to the
807 inverse of the Hessian matrix. The rank of the approximation
808 determines (linearly) the space and time complexity of using the
809 approximation. Higher the rank, the better is the quality of the
810 approximation. The increase in quality is however is bounded for a
811 number of reasons.
812
813 1. The method only uses secant information and not actual
814 derivatives.
815
816 2. The Hessian approximation is constrained to be positive
817 definite.
818
819 So increasing this rank to a large number will cost time and space
820 complexity without the corresponding increase in solution
821 quality. There are no hard and fast rules for choosing the maximum
822 rank. The best choice usually requires some problem specific
823 experimentation.
824
.. member:: TrustRegionStrategyType Solver::Options::trust_region_strategy_type

   Default: ``LEVENBERG_MARQUARDT``

   The trust region step computation algorithm used by
   Ceres. Currently ``LEVENBERG_MARQUARDT`` and ``DOGLEG`` are the two
   valid choices. See :ref:`section-levenberg-marquardt` and
   :ref:`section-dogleg` for more details.

.. member:: DoglegType Solver::Options::dogleg_type

   Default: ``TRADITIONAL_DOGLEG``

   Ceres supports two different dogleg strategies, the
   ``TRADITIONAL_DOGLEG`` method by Powell and the ``SUBSPACE_DOGLEG``
   method described by [ByrdSchnabel]_ . See :ref:`section-dogleg`
   for more details.

.. member:: bool Solver::Options::use_nonmonotonic_steps

   Default: ``false``

   Relax the requirement that the trust-region algorithm take strictly
   decreasing steps. See :ref:`section-non-monotonic-steps` for more
   details.

.. member:: int Solver::Options::max_consecutive_nonmonotonic_steps

   Default: ``5``

   The window size used by the step selection algorithm to accept
   non-monotonic steps.

.. member:: int Solver::Options::max_num_iterations

   Default: ``50``

   Maximum number of iterations for which the solver should run.

.. member:: double Solver::Options::max_solver_time_in_seconds

   Default: ``1e6``

   Maximum amount of time for which the solver should run.

.. member:: int Solver::Options::num_threads

   Default: ``1``

   Number of threads used by Ceres to evaluate the Jacobian.

.. member:: double Solver::Options::initial_trust_region_radius

   Default: ``1e4``

   The size of the initial trust region. When the
   ``LEVENBERG_MARQUARDT`` strategy is used, the reciprocal of this
   number is the initial regularization parameter.

.. member:: double Solver::Options::max_trust_region_radius

   Default: ``1e16``

   The trust region radius is not allowed to grow beyond this value.

.. member:: double Solver::Options::min_trust_region_radius

   Default: ``1e-32``

   The solver terminates when the trust region becomes smaller than
   this value.

.. member:: double Solver::Options::min_relative_decrease

   Default: ``1e-3``

   Lower threshold for relative decrease before a trust-region step is
   accepted.

.. member:: double Solver::Options::lm_min_diagonal

   Default: ``1e-6``

   The ``LEVENBERG_MARQUARDT`` strategy uses a diagonal matrix to
   regularize the trust region step. This is the lower bound on
   the values of this diagonal matrix.

.. member:: double Solver::Options::lm_max_diagonal

   Default: ``1e32``

   The ``LEVENBERG_MARQUARDT`` strategy uses a diagonal matrix to
   regularize the trust region step. This is the upper bound on
   the values of this diagonal matrix.

.. member:: int Solver::Options::max_num_consecutive_invalid_steps

   Default: ``5``

   The step returned by a trust region strategy can sometimes be
   numerically invalid, usually because of conditioning
   issues. Instead of crashing or stopping the optimization, the
   optimizer can go ahead and try solving with a smaller trust
   region/better conditioned problem. This parameter sets the number
   of consecutive retries before the minimizer gives up.

.. member:: double Solver::Options::function_tolerance

   Default: ``1e-6``

   Solver terminates if

   .. math:: \frac{|\Delta \text{cost}|}{\text{cost}} < \text{function_tolerance}

   where, :math:`\Delta \text{cost}` is the change in objective function
   value (up or down) in the current iteration of Levenberg-Marquardt.

.. member:: double Solver::Options::gradient_tolerance

   Default: ``1e-10``

   Solver terminates if

   .. math:: \frac{\|g(x)\|_\infty}{\|g(x_0)\|_\infty} < \text{gradient_tolerance}

   where :math:`\|\cdot\|_\infty` refers to the max norm, and :math:`x_0` is
   the vector of initial parameter values.

.. member:: double Solver::Options::parameter_tolerance

   Default: ``1e-8``

   Solver terminates if

   .. math:: \|\Delta x\| < (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}

   where :math:`\Delta x` is the step computed by the linear solver in the
   current iteration of Levenberg-Marquardt.

.. member:: LinearSolverType Solver::Options::linear_solver_type

   Default: ``SPARSE_NORMAL_CHOLESKY`` / ``DENSE_QR``

   Type of linear solver used to compute the solution to the linear
   least squares problem in each iteration of the Levenberg-Marquardt
   algorithm. If Ceres is built with ``SuiteSparse`` linked in then
   the default is ``SPARSE_NORMAL_CHOLESKY``, it is ``DENSE_QR``
   otherwise.

.. member:: PreconditionerType Solver::Options::preconditioner_type

   Default: ``JACOBI``

   The preconditioner used by the iterative linear solver. The default
   is the block Jacobi preconditioner. Valid values are (in increasing
   order of complexity) ``IDENTITY``, ``JACOBI``, ``SCHUR_JACOBI``,
   ``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. See
   :ref:`section-preconditioner` for more details.

.. member:: SparseLinearAlgebraLibrary Solver::Options::sparse_linear_algebra_library

   Default: ``SUITE_SPARSE``

   Ceres supports the use of two sparse linear algebra libraries,
   ``SuiteSparse``, which is enabled by setting this parameter to
   ``SUITE_SPARSE`` and ``CXSparse``, which can be selected by setting
   this parameter to ``CX_SPARSE``. ``SuiteSparse`` is a
   sophisticated and complex sparse linear algebra library and should
   be used in general. If your needs/platforms prevent you from using
   ``SuiteSparse``, consider using ``CXSparse``, which is a much
   smaller, easier to build library. As can be expected, its
   performance on large problems is not comparable to that of
   ``SuiteSparse``.

.. member:: int Solver::Options::num_linear_solver_threads

   Default: ``1``

   Number of threads used by the linear solver.

.. member:: bool Solver::Options::use_inner_iterations

   Default: ``false``

   Use a non-linear version of a simplified variable projection
   algorithm. Essentially this amounts to doing a further optimization
   on each Newton/Trust region step using a coordinate descent
   algorithm. For more details, see :ref:`section-inner-iterations`.

.. member:: ParameterBlockOrdering* Solver::Options::inner_iteration_ordering

   Default: ``NULL``

   If :member:`Solver::Options::use_inner_iterations` is true, then the
   user has two choices.

   1. Let the solver heuristically decide which parameter blocks to
      optimize in each inner iteration. To do this, set
      :member:`Solver::Options::inner_iteration_ordering` to ``NULL``.

   2. Specify a collection of ordered independent sets. The lower
      numbered groups are optimized before the higher numbered groups
      during the inner optimization phase. Each group must be an
      independent set.

   See :ref:`section-ordering` for more details.

.. member:: ParameterBlockOrdering* Solver::Options::linear_solver_ordering

   Default: ``NULL``

   An instance of the ordering object informs the solver about the
   desired order in which parameter blocks should be eliminated by the
   linear solvers. See :ref:`section-ordering` for more details.

   If ``NULL``, the solver is free to choose an ordering that it
   thinks is best. Note: currently, this option only has an effect on
   the Schur type solvers, support for the ``SPARSE_NORMAL_CHOLESKY``
   solver is forthcoming.

.. member:: bool Solver::Options::use_block_amd

   Default: ``true``

   By virtue of the modeling layer in Ceres being block oriented, all
   the matrices used by Ceres are also block oriented. When doing
   sparse direct factorization of these matrices, the fill-reducing
   ordering algorithms can either be run on the block or the scalar
   form of these matrices. Running it on the block form exposes more
   of the super-nodal structure of the matrix to the Cholesky
   factorization routines. This leads to substantial gains in
   factorization performance. Setting this parameter to ``true``
   enables the use of a block oriented Approximate Minimum Degree
   ordering algorithm. Setting it to ``false`` uses a scalar AMD
   algorithm. This option only makes sense when using
   :member:`Solver::Options::sparse_linear_algebra_library` = ``SUITE_SPARSE``
   as it uses the ``AMD`` package that is part of ``SuiteSparse``.

.. member:: int Solver::Options::linear_solver_min_num_iterations

   Default: ``1``

   Minimum number of iterations used by the linear solver. This only
   makes sense when the linear solver is an iterative solver, e.g.,
   ``ITERATIVE_SCHUR`` or ``CGNR``.

.. member:: int Solver::Options::linear_solver_max_num_iterations

   Default: ``500``

   Maximum number of iterations used by the linear solver. This only
   makes sense when the linear solver is an iterative solver, e.g.,
   ``ITERATIVE_SCHUR`` or ``CGNR``.

.. member:: double Solver::Options::eta

   Default: ``1e-1``

   Forcing sequence parameter. The truncated Newton solver uses this
   number to control the relative accuracy with which the Newton step
   is computed. This constant is passed to
   ``ConjugateGradientsSolver`` which uses it to terminate the
   iterations when

   .. math:: \frac{Q_i - Q_{i-1}}{Q_i} < \frac{\eta}{i}

.. member:: bool Solver::Options::jacobi_scaling

   Default: ``true``

   ``true`` means that the Jacobian is scaled by the norm of its
   columns before being passed to the linear solver. This improves the
   numerical conditioning of the normal equations.

.. member:: LoggingType Solver::Options::logging_type

   Default: ``PER_MINIMIZER_ITERATION``

.. member:: bool Solver::Options::minimizer_progress_to_stdout

   Default: ``false``

   By default the :class:`Minimizer` progress is logged to ``STDERR``
   depending on the ``vlog`` level. If this flag is set to true, and
   :member:`Solver::Options::logging_type` is not ``SILENT``, the logging
   output is sent to ``STDOUT``.

.. member:: bool Solver::Options::return_initial_residuals

   Default: ``false``

.. member:: bool Solver::Options::return_final_residuals

   Default: ``false``

   If true, the vectors :member:`Solver::Summary::initial_residuals` and
   :member:`Solver::Summary::final_residuals` are filled with the residuals
   before and after the optimization. The entries of these vectors are
   in the order in which ResidualBlocks were added to the Problem
   object.

.. member:: bool Solver::Options::return_initial_gradient

   Default: ``false``

.. member:: bool Solver::Options::return_final_gradient

   Default: ``false``

   If true, the vectors :member:`Solver::Summary::initial_gradient` and
   :member:`Solver::Summary::final_gradient` are filled with the gradient
   before and after the optimization. The entries of these vectors are
   in the order in which ParameterBlocks were added to the Problem
   object.

   Since :member:`Problem::AddResidualBlock` adds ParameterBlocks to
   the :class:`Problem` automatically if they do not already exist,
   if you wish to have explicit control over the ordering of the
   vectors, then use :member:`Problem::AddParameterBlock` to
   explicitly add the ParameterBlocks in the order desired.

.. member:: bool Solver::Options::return_initial_jacobian

   Default: ``false``

.. member:: bool Solver::Options::return_final_jacobian

   Default: ``false``

   If ``true``, the Jacobian matrices before and after the
   optimization are returned in
   :member:`Solver::Summary::initial_jacobian` and
   :member:`Solver::Summary::final_jacobian` respectively.

   The rows of these matrices are in the same order in which the
   ResidualBlocks were added to the Problem object. The columns are in
   the same order in which the ParameterBlocks were added to the
   Problem object.

   Since :member:`Problem::AddResidualBlock` adds ParameterBlocks to
   the :class:`Problem` automatically if they do not already exist,
   if you wish to have explicit control over the ordering of the
   vectors, then use :member:`Problem::AddParameterBlock` to
   explicitly add the ParameterBlocks in the order desired.

   The Jacobian matrices are stored as compressed row sparse
   matrices. Please see ``include/ceres/crs_matrix.h`` for more
   details of the format.

.. member:: vector<int> Solver::Options::lsqp_iterations_to_dump

   Default: ``empty``

   List of iterations at which the optimizer should dump the linear
   least squares problem to disk. Useful for testing and
   benchmarking. If ``empty``, no problems are dumped.

.. member:: string Solver::Options::lsqp_dump_directory

   Default: ``/tmp``

   If :member:`Solver::Options::lsqp_iterations_to_dump` is non-empty, then
   this setting determines the directory to which the files containing
   the linear least squares problems are written.

.. member:: DumpFormatType Solver::Options::lsqp_dump_format

   Default: ``TEXTFILE``

   The format in which linear least squares problems should be logged
   when :member:`Solver::Options::lsqp_iterations_to_dump` is non-empty.
   There are three options:

   * ``CONSOLE`` prints the linear least squares problem in a human
     readable format to ``stderr``. The Jacobian is printed as a
     dense matrix. The vectors :math:`D`, :math:`x` and :math:`f` are
     printed as dense vectors. This should only be used for small
     problems.

   * ``PROTOBUF`` Write out the linear least squares problem to the
     directory pointed to by :member:`Solver::Options::lsqp_dump_directory` as
     a protocol buffer. ``linear_least_squares_problems.h/cc``
     contains routines for loading these problems. For details on the
     on disk format used, see ``matrix.proto``. The files are named
     ``lm_iteration_???.lsqp``. This requires that ``protobuf`` be
     linked into Ceres Solver.

   * ``TEXTFILE`` Write out the linear least squares problem to the
     directory pointed to by :member:`Solver::Options::lsqp_dump_directory` as
     text files which can be read into ``MATLAB/Octave``. The Jacobian
     is dumped as a text file containing :math:`(i,j,s)` triplets, the
     vectors :math:`D`, :math:`x` and :math:`f` are dumped as text files
     containing a list of their values.

     A ``MATLAB/Octave`` script called ``lm_iteration_???.m`` is also
     output, which can be used to parse and load the problem into memory.

.. member:: bool Solver::Options::check_gradients

   Default: ``false``

   Check all Jacobians computed by each residual block with finite
   differences. This is expensive since it involves computing the
   derivative by normal means (e.g. user specified, autodiff, etc),
   then also computing it using finite differences. The results are
   compared, and if they differ substantially, details are printed to
   the log.

.. member:: double Solver::Options::gradient_check_relative_precision

   Default: ``1e-8``

   Precision to check for in the gradient checker. If the relative
   difference between an element in a Jacobian exceeds this number,
   then the Jacobian for that cost term is dumped.

.. member:: double Solver::Options::numeric_derivative_relative_step_size

   Default: ``1e-6``

   Relative shift used for taking numeric derivatives. For finite
   differencing, each dimension is evaluated at slightly shifted
   values, e.g., for forward differences, the numerical derivative is

   .. math::

      \delta &= numeric\_derivative\_relative\_step\_size\\
      \Delta f &= \frac{f((1 + \delta) x) - f(x)}{\delta x}

   The finite differencing is done along each dimension. The reason to
   use a relative (rather than absolute) step size is that this way,
   numeric differentiation works for functions where the arguments are
   typically large (e.g. :math:`10^9`) and when the values are small
   (e.g. :math:`10^{-5}`). It is possible to construct *torture cases*
   which break this finite difference heuristic, but they do not come
   up often in practice.

.. member:: vector<IterationCallback*> Solver::Options::callbacks

   Callbacks that are executed at the end of each iteration of the
   :class:`Minimizer`. They are executed in the order that they are
   specified in this vector. By default, parameter blocks are updated
   only at the end of the optimization, i.e. when the
   :class:`Minimizer` terminates. This behavior is controlled by
   :member:`Solver::Options::update_state_every_iteration`. If the user
   wishes to have access to the updated parameter blocks when his/her
   callbacks are executed, then set
   :member:`Solver::Options::update_state_every_iteration` to true.

   The solver does NOT take ownership of these pointers.

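   A hedged sketch of a minimal callback; the exact fields available on
   the :class:`IterationSummary` object passed to the callback are
   assumptions here and should be checked against
   ``include/ceres/iteration_callback.h``. ``options`` is the
   :class:`Solver::Options` instance being configured:

   .. code-block:: c++

      class LoggingCallback : public ceres::IterationCallback {
       public:
        virtual ceres::CallbackReturnType operator()(
            const ceres::IterationSummary& summary) {
          // Log the iteration number and the current cost (assumed fields).
          std::cout << summary.iteration << ": " << summary.cost << "\n";
          return ceres::SOLVER_CONTINUE;
        }
      };

      LoggingCallback callback;
      options.callbacks.push_back(&callback);
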
.. member:: bool Solver::Options::update_state_every_iteration

   Default: ``false``

   Normally the parameter blocks are only updated when the solver
   terminates. Setting this to true updates them in every
   iteration. This setting is useful when building an interactive
   application using Ceres and using an :class:`IterationCallback`.

.. member:: string Solver::Options::solver_log

   Default: ``empty``

   If non-empty, a summary of the execution of the solver is recorded
   to this file. This file is used for recording Ceres'
   performance. Currently, only the iteration number, total time and
   the objective function value are logged. The format of this file is
   expected to change over time as the performance evaluation
   framework is fleshed out.

:class:`ParameterBlockOrdering`
-------------------------------

.. class:: ParameterBlockOrdering

   TBD

:class:`IterationCallback`
--------------------------

.. class:: IterationCallback

   TBD

:class:`CRSMatrix`
------------------

.. class:: CRSMatrix

   TBD

:class:`Solver::Summary`
------------------------

.. class:: Solver::Summary

   TBD

:class:`GradientChecker`
------------------------

.. class:: GradientChecker