Documentation update. Update modeling.rst and solving.rst to reflect changes to the API. Change-Id: Id1a8adfed1486f08e5fd67c5af2d29708a26490c
diff --git a/docs/source/bibliography.rst b/docs/source/bibliography.rst index 7ba435a..bd99758 100644 --- a/docs/source/bibliography.rst +++ b/docs/source/bibliography.rst
@@ -48,6 +48,12 @@ preconditioning for bundle adjustment**, *In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition*, 2012. +.. [Kanzow] C. Kanzow, N. Yamashita and M. Fukushima, + **Levenberg–Marquardt methods with strong local convergence + properties for solving nonlinear equations with convex + constraints**, *Journal of Computational and Applied Mathematics*, + 177(2):375–397, 2005. + .. [Levenberg] K. Levenberg, **A method for the solution of certain nonlinear problems in least squares**, *Quart. Appl. Math*, 2(2):164–168, 1944. @@ -113,7 +119,3 @@ Levenberg Marquardt Method for Large Sparse Nonlinear Least Squares**, *Journal of the Australian Mathematical Society Series B*, 26(4):387–403, 1985. - - - -
diff --git a/docs/source/index.rst b/docs/source/index.rst index c52af14..901d51a 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst
@@ -61,7 +61,8 @@ git clone https://ceres-solver.googlesource.com/ceres-solver -* Read the :ref:`chapter-tutorial`, browse :ref:`chapter-modeling` and :ref:`chapter-solving`. +* Read the :ref:`chapter-tutorial`, browse the chapters on the + :ref:`chapter-modeling` API and the :ref:`chapter-solving` API. * Join the `mailing list <https://groups.google.com/forum/?fromgroups#!forum/ceres-solver>`_ and ask questions.
diff --git a/docs/source/modeling.rst b/docs/source/modeling.rst index 4460174..5fff351 100644 --- a/docs/source/modeling.rst +++ b/docs/source/modeling.rst
@@ -4,45 +4,67 @@ .. _`chapter-modeling`: -============ -Modeling API -============ +======== +Modeling +======== -Recall that Ceres solves robustified non-linear least squares problems -of the form +Ceres solver consists of two distinct parts: a modeling API, which +provides a rich set of tools to construct an optimization problem one +term at a time, and a solver API that controls the minimization +algorithm. This chapter is devoted to the task of modeling +optimization problems using Ceres. :ref:`chapter-solving` discusses +the various ways in which an optimization problem can be solved using +Ceres. -.. math:: \frac{1}{2}\sum_{i=1} \rho_i\left(\left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2\right). - :label: ceresproblem +Ceres solves robustified bounds constrained non-linear least squares +problems of the form: -The expression +.. math:: :label: ceresproblem + + \min_{\mathbf{x}} &\quad \frac{1}{2}\sum_{i} + \rho_i\left(\left\|f_i\left(x_{i_1}, + ... ,x_{i_k}\right)\right\|^2\right) \\ + \text{s.t.} &\quad l_j \le x_j \le u_j + +In Ceres parlance, the expression :math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)` -is known as a ``ResidualBlock``, where :math:`f_i(\cdot)` is a -:class:`CostFunction` that depends on the parameter blocks -:math:`\left[x_{i_1},... , x_{i_k}\right]`. In most optimization -problems small groups of scalars occur together. For example the three -components of a translation vector and the four components of the -quaternion that define the pose of a camera. We refer to such a group -of small scalars as a ``ParameterBlock``. Of course a -``ParameterBlock`` can just be a single parameter. :math:`\rho_i` is a -:class:`LossFunction`. A :class:`LossFunction` is a scalar function -that is used to reduce the influence of outliers on the solution of -non-linear least squares problems. +is known as a **residual block**, where :math:`f_i(\cdot)` is a +:class:`CostFunction` that depends on the **parameter blocks** +:math:`\left\{x_{i_1},... , x_{i_k}\right\}`. -In this chapter we will describe the various classes that are part of -Ceres Solver's modeling API, and how they can be used to construct an -optimization problem. Once a problem has been constructed, various -methods for solving them will be discussed in -:ref:`chapter-solving`. It is by design that the modeling and the -solving APIs are orthogonal to each other. This enables -switching/tweaking of various solver parameters without having to -touch the problem once it has been successfully modeled. +In most optimization problems small groups of scalars occur +together. For example, the three components of a translation vector and +the four components of the quaternion that define the pose of a +camera. We refer to such a group of scalars as a **parameter block**. Of +course a parameter block can be just a single scalar too. + +:math:`\rho_i` is a :class:`LossFunction`. A :class:`LossFunction` is +a scalar valued function that is used to reduce the influence of +outliers on the solution of non-linear least squares problems. + +:math:`l_j` and :math:`u_j` are lower and upper bounds on the +parameter block :math:`x_j`. + +As a special case, when :math:`\rho_i(x) = x`, i.e., the identity +function, and :math:`l_j = -\infty` and :math:`u_j = \infty`, we get +the more familiar unconstrained `non-linear least squares problem +<http://en.wikipedia.org/wiki/Non-linear_least_squares>`_. + +.. math:: :label: ceresproblemunconstrained + + \frac{1}{2}\sum_{i} \left\|f_i\left(x_{i_1}, ... 
,x_{i_k}\right)\right\|^2. :class:`CostFunction` --------------------- -The single biggest task when modeling a problem is specifying the -residuals and their derivatives. This is done using -:class:`CostFunction` objects. +For each term in the objective function, a :class:`CostFunction` is +responsible for computing a vector of residuals and if asked a vector +of Jacobian matrices, i.e., given :math:`\left[x_{i_1}, ... , +x_{i_k}\right]`, compute the vector +:math:`f_i\left(x_{i_1},...,x_{i_k}\right)` and the matrices + + .. math:: J_{ij} = \frac{\partial}{\partial x_{i_j}}f_i\left(x_{i_1},...,x_{i_k}\right),\quad \forall j \in \{i_1,..., i_k\} + .. class:: CostFunction @@ -61,22 +83,14 @@ void set_num_residuals(int num_residuals); }; - Given parameter blocks :math:`\left[x_{i_1}, ... , x_{i_k}\right]`, - a :class:`CostFunction` is responsible for computing a vector of - residuals and if asked a vector of Jacobian matrices, i.e., given - :math:`\left[x_{i_1}, ... , x_{i_k}\right]`, compute the vector - :math:`f_i\left(x_{i_1},...,x_{i_k}\right)` and the matrices - .. math:: J_{ij} = \frac{\partial}{\partial x_{i_j}}f_i\left(x_{i_1},...,x_{i_k}\right),\quad \forall j \in \{i_1,..., i_k\} - - The signature of the :class:`CostFunction` (number and sizes of - input parameter blocks and number of outputs) is stored in - :member:`CostFunction::parameter_block_sizes_` and - :member:`CostFunction::num_residuals_` respectively. User code - inheriting from this class is expected to set these two members - with the corresponding accessors. This information will be verified - by the :class:`Problem` when added with - :func:`Problem::AddResidualBlock`. +The signature of the :class:`CostFunction` (number and sizes of input +parameter blocks and number of outputs) is stored in +:member:`CostFunction::parameter_block_sizes_` and +:member:`CostFunction::num_residuals_` respectively. User code +inheriting from this class is expected to set these two members with +the corresponding accessors. This information will be verified by the +:class:`Problem` when added with :func:`Problem::AddResidualBlock`. .. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians) @@ -114,19 +128,6 @@ This can be used to communicate numerical failures in Jacobian computations for instance. - A more interesting and common use is to impose constraints on the - parameters. If the initial values of the parameter blocks satisfy - the constraints, then returning false whenever the constraints are - not satisfied will prevent the solver from moving into the - infeasible region. This is not a very sophisticated mechanism for - enforcing constraints, but is often good enough for things like - non-negativity constraints. - - Note that it is important that the initial values of the parameter - block must be feasible, otherwise the solver will declare a - numerical problem at iteration 0. - - :class:`SizedCostFunction` -------------------------- @@ -177,7 +178,12 @@ int N9 = 0> // Number of parameters in block 9. class AutoDiffCostFunction : public SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> { - }; + public: + explicit AutoDiffCostFunction(CostFunctor* functor); + // Ignore the template parameter kNumResiduals and use + // num_residuals instead. 
+ AutoDiffCostFunction(CostFunctor* functor, int num_residuals); + }; To get an auto differentiated cost function, you must define a class with a templated ``operator()`` (a functor) that computes the @@ -189,9 +195,6 @@ The function must write the computed value in the last argument (the only non-``const`` one) and return true to indicate success. - Please see :class:`CostFunction` for details on how the return - value may be used to impose simple constraints on the parameter - block. For example, consider a scalar error :math:`e = k - x^\top y`, where both :math:`x` and :math:`y` are two-dimensional vector @@ -802,6 +805,8 @@ +.. _`section-loss_function`: + :class:`LossFunction` --------------------- @@ -1228,10 +1233,10 @@ .. class:: Problem - :class:`Problem` holds the robustified non-linear least squares - problem :eq:`ceresproblem`. To create a least squares problem, use - the :func:`Problem::AddResidualBlock` and - :func:`Problem::AddParameterBlock` methods. + :class:`Problem` holds the robustified bounds constrained + non-linear least squares problem :eq:`ceresproblem`. To create a + least squares problem, use the :func:`Problem::AddResidualBlock` + and :func:`Problem::AddParameterBlock` methods. For example a problem containing 3 parameter blocks of sizes 3, 4 and 5 respectively and two residual blocks of size 2 and 6: @@ -1407,54 +1412,76 @@ parameterizations only once. The local parameterization can only be set once per parameter, and cannot be changed once set. -.. function:: int Problem::NumParameterBlocks() const +.. function:: LocalParameterization* Problem::GetParameterization(double* values) const + + Get the local parameterization object associated with this + parameter block. If there is no parameterization object associated + then `NULL` is returned. + +.. function:: void Problem::SetParameterLowerBound(double* values, int index, double lower_bound) + + Set the lower bound for the parameter at position `index` in the + parameter block corresponding to `values`. By default the lower + bound is :math:`-\infty`. + +.. function:: void Problem::SetParameterUpperBound(double* values, int index, double upper_bound) + + Set the upper bound for the parameter at position `index` in the + parameter block corresponding to `values`. By default the value is + :math:`\infty`. + +.. function:: int Problem::NumParameterBlocks() const Number of parameter blocks in the problem. Always equals parameter_blocks().size() and parameter_block_sizes().size(). -.. function:: int Problem::NumParameters() const +.. function:: int Problem::NumParameters() const The size of the parameter vector obtained by summing over the sizes of all the parameter blocks. -.. function:: int Problem::NumResidualBlocks() const +.. function:: int Problem::NumResidualBlocks() const Number of residual blocks in the problem. Always equals residual_blocks().size(). -.. function:: int Problem::NumResiduals() const +.. function:: int Problem::NumResiduals() const The size of the residual vector obtained by summing over the sizes of all of the residual blocks. -.. function:: int Problem::ParameterBlockSize(const double* values) const +.. function:: int Problem::ParameterBlockSize(const double* values) const The size of the parameter block. -.. function:: int Problem::ParameterBlockLocalSize(const double* values) const +.. function:: int Problem::ParameterBlockLocalSize(const double* values) const - The size of local parameterization for the parameter block. 
If - there is no local parameterization associated with this parameter - block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``. + The size of local parameterization for the parameter block. If + there is no local parameterization associated with this parameter + block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``. -.. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const +.. function:: bool Problem::HasParameterBlock(const double* values) const + + Is the given parameter block present in the problem or not? + +.. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const Fills the passed ``parameter_blocks`` vector with pointers to the parameter blocks currently in the problem. After this call, ``parameter_block.size() == NumParameterBlocks``. -.. function:: void Problem::GetResidualBlocks(vector<ResidualBlockId>* residual_blocks) const +.. function:: void Problem::GetResidualBlocks(vector<ResidualBlockId>* residual_blocks) const Fills the passed `residual_blocks` vector with pointers to the residual blocks currently in the problem. After this call, `residual_blocks.size() == NumResidualBlocks`. -.. function:: void Problem::GetParameterBlocksForResidualBlock(const ResidualBlockId residual_block, vector<double*>* parameter_blocks) const +.. function:: void Problem::GetParameterBlocksForResidualBlock(const ResidualBlockId residual_block, vector<double*>* parameter_blocks) const Get all the parameter blocks that depend on the given residual block. -.. function:: void Problem::GetResidualBlocksForParameterBlock(const double* values, vector<ResidualBlockId>* residual_blocks) const +.. function:: void Problem::GetResidualBlocksForParameterBlock(const double* values, vector<ResidualBlockId>* residual_blocks) const Get all the residual blocks that depend on the given parameter block. @@ -1465,7 +1492,7 @@ blocks for a parameter block will incur a scan of the entire :class:`Problem` object. -.. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian) +.. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian) Evaluate a :class:`Problem`. Any of the output pointers can be `NULL`. Which residual blocks and parameter blocks are used is
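The bounds-setting calls documented in the hunk above (``Problem::SetParameterLowerBound`` and ``Problem::SetParameterUpperBound``) are easiest to see end to end. The following is a minimal sketch and is not part of the patch; the functor, starting point and bound values are illustrative only.

.. code-block:: c++

   #include "ceres/ceres.h"

   // Illustrative residual f(x) = 10 - x; its unconstrained minimizer is x = 10.
   struct ExampleResidual {
     template <typename T>
     bool operator()(const T* const x, T* residual) const {
       residual[0] = T(10.0) - x[0];
       return true;
     }
   };

   int main(int argc, char** argv) {
     double x = 0.5;

     ceres::Problem problem;
     problem.AddResidualBlock(
         new ceres::AutoDiffCostFunction<ExampleResidual, 1, 1>(
             new ExampleResidual),
         NULL,  // No loss function, i.e. rho_i(s) = s.
         &x);

     // Constrain the single parameter to the interval [0, 3]. With these
     // bounds the solver is expected to stop at the upper bound x = 3.
     problem.SetParameterLowerBound(&x, 0, 0.0);
     problem.SetParameterUpperBound(&x, 0, 3.0);

     ceres::Solver::Options options;
     ceres::Solver::Summary summary;
     ceres::Solve(options, &problem, &summary);
     return 0;
   }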
diff --git a/docs/source/solving.rst b/docs/source/solving.rst index 1e11c4a..4d091a3 100644 --- a/docs/source/solving.rst +++ b/docs/source/solving.rst
@@ -5,9 +5,9 @@ .. _chapter-solving: -=========== -Solving API -=========== +======= +Solving +======= Introduction ============ @@ -23,7 +23,8 @@ :math:`m`-dimensional function of :math:`x`. We are interested in solving the following optimization problem [#f1]_ . -.. math:: \arg \min_x \frac{1}{2}\|F(x)\|^2\ . +.. math:: \arg \min_x \frac{1}{2}\|F(x)\|^2\ . \\ + L \le x \le U :label: nonlinsq Here, the Jacobian :math:`J(x)` of :math:`F(x)` is an :math:`m\times @@ -80,15 +81,20 @@ The basic trust region algorithm looks something like this. 1. Given an initial point :math:`x` and a trust region radius :math:`\mu`. - 2. :math:`\arg \min_{\Delta x} \frac{1}{2}\|J(x)\Delta - x + F(x)\|^2` s.t. :math:`\|D(x)\Delta x\|^2 \le \mu` + 2. Solve + + .. math:: + \arg \min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\ + \text{such that} &\|D(x)\Delta x\|^2 \le \mu\\ + &L \le x + \Delta x \le U. + 3. :math:`\rho = \frac{\displaystyle \|F(x + \Delta x)\|^2 - \|F(x)\|^2}{\displaystyle \|J(x)\Delta x + F(x)\|^2 - \|F(x)\|^2}` 4. if :math:`\rho > \epsilon` then :math:`x = x + \Delta x`. 5. if :math:`\rho > \eta_1` then :math:`\rho = 2 \rho` 6. else if :math:`\rho < \eta_2` then :math:`\rho = 0.5 * \rho` - 7. Goto 2. + 7. Go to 2. Here, :math:`\mu` is the trust region radius, :math:`D(x)` is some matrix used to define a metric on the domain of :math:`F(x)` and @@ -102,21 +108,27 @@ The key computational step in a trust-region algorithm is the solution of the constrained optimization problem -.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2\quad \text{such that}\quad \|D(x)\Delta x\|^2 \le \mu +.. math:: + \arg \min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\ + \text{such that} &\|D(x)\Delta x\|^2 \le \mu\\ + &L \le x + \Delta x \le U. :label: trp There are a number of different ways of solving this problem, each giving rise to a different concrete trust-region algorithm. Currently Ceres, implements two trust-region algorithms - Levenberg-Marquardt -and Dogleg. The user can choose between them by setting -:member:`Solver::Options::trust_region_strategy_type`. +and Dogleg, each of which is augmented with a line search if bounds +constraints are present [Kanzow]_. The user can choose between them by +setting :member:`Solver::Options::trust_region_strategy_type`. .. rubric:: Footnotes -.. [#f1] At the level of the non-linear solver, the block - structure is not relevant, therefore our discussion here is - in terms of an optimization problem defined over a state - vector of size :math:`n`. +.. [#f1] At the level of the non-linear solver, the block structure is + not relevant, therefore our discussion here is in terms of an + optimization problem defined over a state vector of size + :math:`n`. Similarly, the presence of loss functions is also + ignored as the problem is internally converted into a pure + non-linear least squares problem. .. _section-levenberg-marquardt: @@ -346,10 +358,9 @@ Line Search Methods =================== -**The implementation of line search algorithms in Ceres Solver is -fairly new and not very well tested, so for now this part of the -solver should be considered beta quality. We welcome reports of your -experiences both good and bad on the mailinglist.** +The line search method in Ceres Solver cannot handle bounds +constraints right now, so it can only be used for solving +unconstrained problems. Line search algorithms @@ -1208,7 +1219,7 @@ Number of threads used by the linear solver. -.. 
member:: ParameterBlockOrdering* Solver::Options::linear_solver_ordering +.. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::linear_solver_ordering Default: ``NULL`` @@ -1245,6 +1256,22 @@ expense of an extra copy of the Jacobian matrix. Setting ``use_postordering`` to ``true`` enables this tradeoff. +.. member:: bool Solver::Options::dynamic_sparsity + + Some non-linear least squares problems are symbolically dense but + numerically sparse, i.e., at any given state only a small number of + Jacobian entries are non-zero, but the position and number of + non-zeros is different depending on the state. For these problems + it can be useful to factorize the sparse Jacobian at each solver + iteration instead of including all of the zero entries in a single + general factorization. + + If your problem does not have this property (or you do not know), + then it is probably best to keep this false, otherwise it will + likely lead to worse performance. + + This setting affects the `SPARSE_NORMAL_CHOLESKY` solver. + .. member:: int Solver::Options::min_linear_solver_iterations Default: ``1`` @@ -1304,7 +1331,7 @@ inner iterations in subsequent trust region minimizer iterations is disabled. -.. member:: ParameterBlockOrdering* Solver::Options::inner_iteration_ordering +.. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::inner_iteration_ordering Default: ``NULL`` @@ -1829,6 +1856,16 @@ A full multiline description of the state of the solver after termination. +.. function:: bool Solver::Summary::IsSolutionUsable() const + + Whether the solution returned by the optimization algorithm can be + relied on to be numerically sane. This will be the case if + `Solver::Summary::termination_type` is set to `CONVERGENCE`, + `USER_SUCCESS` or `NO_CONVERGENCE`, i.e., either the solver + converged by meeting one of the convergence tolerances, the user + indicated that it had converged, or it ran to the maximum number + of iterations or time. + .. member:: MinimizerType Solver::Summary::minimizer_type Type of minimization algorithm used. @@ -1841,7 +1878,6 @@ Reason why the solver terminated. - .. member:: double Solver::Summary::initial_cost Cost of the problem (value of the objective function) before the
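To make the solver-side additions concrete, here is a hedged sketch, not part of the patch, of how ``Solver::Options::dynamic_sparsity`` and ``Solver::Summary::IsSolutionUsable()`` might be used together. The function name is made up, and the problem is assumed to have been constructed elsewhere.

.. code-block:: c++

   #include <iostream>
   #include "ceres/ceres.h"

   // Solve a previously constructed problem whose Jacobian is symbolically
   // dense but numerically sparse, then check that the result is usable.
   void SolveWithDynamicSparsity(ceres::Problem* problem) {
     ceres::Solver::Options options;
     options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
     // Only helpful when the sparsity pattern changes from state to state;
     // otherwise leave this at its default of false.
     options.dynamic_sparsity = true;
     options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;

     ceres::Solver::Summary summary;
     ceres::Solve(options, problem, &summary);

     if (summary.IsSolutionUsable()) {
       std::cout << summary.BriefReport() << "\n";
     } else {
       std::cout << "Solve failed: " << summary.FullReport() << "\n";
     }
   }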
diff --git a/docs/source/version_history.rst b/docs/source/version_history.rst index 195ca75..39f2c95 100644 --- a/docs/source/version_history.rst +++ b/docs/source/version_history.rst
@@ -10,7 +10,7 @@ New Features ------------ -#. Parameters can now have upper and/or lower bounds when using the +#. Support for bounds constrained optimization problems when using the trust region minimizer. #. Problems in which the sparsity structure of the Jacobian changes over the course of the optimization can now be solved much more @@ -34,8 +34,8 @@ parameter block is already present in the problem. #. ``Problem::GetParameterization`` can be used to access the parameterization associated with a parameter block. -#. Added the (2,4,9) template specialization for PartitionedMatrixView - and SchurEliminator. +#. Added the (2,4,9) and (2,4,8) template specializations for + PartitionedMatrixView and SchurEliminator. #. An example demonstrating the use of DynamicAutoDiffCostFunction. (Joydeep Biswas) #. An example demonstrating the use of dynamic sparsity (Richard @@ -92,6 +92,7 @@ Bug Fixes --------- +#. Fixed errant verbose levels (Bjorn Piltz) #. Variety of code cleanups, optimizations and bug fixes to the line search minimizer code (Alex Stewart) #. Fixed ``BlockSparseMatrix::Transpose`` when the matrix has row and @@ -121,7 +122,8 @@ Schönberger) #. METIS_FOUND is never set. Changed the commit to fit the setting of the other #._FOUND definitions. (Andreas Franek) -#. Variety of bug fixes to the ``CMake`` build system (Alex Stewart) +#. Variety of bug fixes and cleanups to the ``CMake`` build system + (Alex Stewart) #. Removed fictious shared library target from the NDK build. #. Solver::Options now uses ``shared_ptr`` to handle ownership of ``Solver::Options::linear_solver_ordering`` and
diff --git a/include/ceres/solver.h b/include/ceres/solver.h index 4fb6e4d..fc70073 100644 --- a/include/ceres/solver.h +++ b/include/ceres/solver.h
@@ -365,7 +365,7 @@ // Minimizer terminates when // - // max_i |gradient_i| < gradient_tolerance * max_i|initial_gradient_i| + // max_i |x - Project(Plus(x, -g(x)))| < gradient_tolerance // // This value should typically be 1e-4 * function_tolerance. double gradient_tolerance;
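The comment above suggests keeping ``gradient_tolerance`` at roughly ``1e-4 * function_tolerance``. An illustrative sketch of expressing that relationship when configuring the solver; the helper function and the value ``1e-6`` are assumptions, not part of this change.

.. code-block:: c++

   #include "ceres/solver.h"

   // Keep the projected-gradient tolerance at the suggested
   // 1e-4 * function_tolerance ratio.
   void ConfigureTolerances(ceres::Solver::Options* options) {
     options->function_tolerance = 1e-6;
     options->gradient_tolerance = 1e-4 * options->function_tolerance;
   }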