ceres-solver / ceres-solver / e91995cce456d7edf404103bd3dc40794e13886e / . / docs / source / gradient_solver.rst

.. default-domain:: cpp | |

.. highlight:: c++ | |

.. cpp:namespace:: ceres | |

.. _chapter-gradient_problem_solver: | |

================================== | |

General Unconstrained Minimization | |

================================== | |

Modeling | |

======== | |

:class:`FirstOrderFunction` | |

--------------------------- | |

.. class:: FirstOrderFunction | |

Instances of :class:`FirstOrderFunction` implement the evaluation of | |

a function and its gradient. | |

.. code-block:: c++ | |

class FirstOrderFunction { | |

public: | |

virtual ~FirstOrderFunction() {} | |

virtual bool Evaluate(const double* const parameters, | |

double* cost, | |

double* gradient) const = 0; | |

virtual int NumParameters() const = 0; | |

}; | |

.. function:: bool FirstOrderFunction::Evaluate(const double* const parameters, double* cost, double* gradient) const | |

Evaluate the cost/value of the function. If ``gradient`` is not | |

``nullptr`` then evaluate the gradient too. If evaluation is | |

successful return, ``true`` else return ``false``. | |

``cost`` guaranteed to be never ``nullptr``, ``gradient`` can be ``nullptr``. | |

.. function:: int FirstOrderFunction::NumParameters() const | |

Number of parameters in the domain of the function. | |

:class:`GradientProblem` | |

------------------------ | |

.. NOTE:: | |

The :class:`LocalParameterization` interface and associated classes | |

are deprecated. They will be removed in the version 2.2.0. Please use | |

:class:`Manifold` based constructor instead. | |

.. class:: GradientProblem | |

.. code-block:: c++ | |

class GradientProblem { | |

public: | |

explicit GradientProblem(FirstOrderFunction* function); | |

GradientProblem(FirstOrderFunction* function, | |

LocalParameterization* parameterization); | |

GradientProblem(FirstOrderFunction* function, | |

Manifold* manifold); | |

int NumParameters() const; | |

int NumLocalParameters() const { return NumTangentParameters(); } | |

int NumTangentParameters() const; | |

bool Evaluate(const double* parameters, double* cost, double* gradient) const; | |

bool Plus(const double* x, const double* delta, double* x_plus_delta) const; | |

}; | |

Instances of :class:`GradientProblem` represent general non-linear | |

optimization problems that must be solved using just the value of the | |

objective function and its gradient. Unlike the :class:`Problem` | |

class, which can only be used to model non-linear least squares | |

problems, instances of :class:`GradientProblem` not restricted in the | |

form of the objective function. | |

Structurally :class:`GradientProblem` is a composition of a | |

:class:`FirstOrderFunction` and optionally a | |

:class:`LocalParameterization` or a :class:`Manifold`. | |

The :class:`FirstOrderFunction` is responsible for evaluating the cost | |

and gradient of the objective function. | |

The :class:`LocalParameterization`/:class:`Manifold` is responsible | |

for going back and forth between the ambient space and the local | |

tangent space. When a :class:`LocalParameterization` or a | |

:class:`Manifold` is not provided, then the tangent space is assumed | |

to coincide with the ambient Euclidean space that the gradient vector | |

lives in. | |

The constructor takes ownership of the :class:`FirstOrderFunction` and | |

:class:`LocalParameterization` or :class:`Manifold` objects passed to | |

it. | |

.. function:: void Solve(const GradientProblemSolver::Options& options, const GradientProblem& problem, double* parameters, GradientProblemSolver::Summary* summary) | |

Solve the given :class:`GradientProblem` using the values in | |

``parameters`` as the initial guess of the solution. | |

Solving | |

======= | |

:class:`GradientProblemSolver::Options` | |

--------------------------------------- | |

.. class:: GradientProblemSolver::Options | |

:class:`GradientProblemSolver::Options` controls the overall | |

behavior of the solver. We list the various settings and their | |

default values below. | |

.. function:: bool GradientProblemSolver::Options::IsValid(string* error) const | |

Validate the values in the options struct and returns true on | |

success. If there is a problem, the method returns false with | |

``error`` containing a textual description of the cause. | |

.. member:: LineSearchDirectionType GradientProblemSolver::Options::line_search_direction_type | |

Default: ``LBFGS`` | |

Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``, | |

``BFGS`` and ``LBFGS``. | |

.. member:: LineSearchType GradientProblemSolver::Options::line_search_type | |

Default: ``WOLFE`` | |

Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions). | |

Note that in order for the assumptions underlying the ``BFGS`` and | |

``LBFGS`` line search direction algorithms to be guaranteed to be | |

satisifed, the ``WOLFE`` line search should be used. | |

.. member:: NonlinearConjugateGradientType GradientProblemSolver::Options::nonlinear_conjugate_gradient_type | |

Default: ``FLETCHER_REEVES`` | |

Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and | |

``HESTENES_STIEFEL``. | |

.. member:: int GradientProblemSolver::Options::max_lbfs_rank | |

Default: 20 | |

The L-BFGS hessian approximation is a low rank approximation to the | |

inverse of the Hessian matrix. The rank of the approximation | |

determines (linearly) the space and time complexity of using the | |

approximation. Higher the rank, the better is the quality of the | |

approximation. The increase in quality is however is bounded for a | |

number of reasons. | |

1. The method only uses secant information and not actual | |

derivatives. | |

2. The Hessian approximation is constrained to be positive | |

definite. | |

So increasing this rank to a large number will cost time and space | |

complexity without the corresponding increase in solution | |

quality. There are no hard and fast rules for choosing the maximum | |

rank. The best choice usually requires some problem specific | |

experimentation. | |

.. member:: bool GradientProblemSolver::Options::use_approximate_eigenvalue_bfgs_scaling | |

Default: ``false`` | |

As part of the ``BFGS`` update step / ``LBFGS`` right-multiply | |

step, the initial inverse Hessian approximation is taken to be the | |

Identity. However, [Oren]_ showed that using instead :math:`I * | |

\gamma`, where :math:`\gamma` is a scalar chosen to approximate an | |

eigenvalue of the true inverse Hessian can result in improved | |

convergence in a wide variety of cases. Setting | |

``use_approximate_eigenvalue_bfgs_scaling`` to true enables this | |

scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each | |

iteration). | |

Precisely, approximate eigenvalue scaling equates to | |

.. math:: \gamma = \frac{y_k' s_k}{y_k' y_k} | |

With: | |

.. math:: y_k = \nabla f_{k+1} - \nabla f_k | |

.. math:: s_k = x_{k+1} - x_k | |

Where :math:`f()` is the line search objective and :math:`x` the | |

vector of parameter values [NocedalWright]_. | |

It is important to note that approximate eigenvalue scaling does | |

**not** *always* improve convergence, and that it can in fact | |

*significantly* degrade performance for certain classes of problem, | |

which is why it is disabled by default. In particular it can | |

degrade performance when the sensitivity of the problem to different | |

parameters varies significantly, as in this case a single scalar | |

factor fails to capture this variation and detrimentally downscales | |

parts of the Jacobian approximation which correspond to | |

low-sensitivity parameters. It can also reduce the robustness of the | |

solution to errors in the Jacobians. | |

.. member:: LineSearchIterpolationType GradientProblemSolver::Options::line_search_interpolation_type | |

Default: ``CUBIC`` | |

Degree of the polynomial used to approximate the objective | |

function. Valid values are ``BISECTION``, ``QUADRATIC`` and | |

``CUBIC``. | |

.. member:: double GradientProblemSolver::Options::min_line_search_step_size | |

The line search terminates if: | |

.. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size} | |

where :math:`\|\cdot\|_\infty` refers to the max norm, and | |

:math:`\Delta x_k` is the step change in the parameter values at | |

the :math:`k`-th iteration. | |

.. member:: double GradientProblemSolver::Options::line_search_sufficient_function_decrease | |

Default: ``1e-4`` | |

Solving the line search problem exactly is computationally | |

prohibitive. Fortunately, line search based optimization algorithms | |

can still guarantee convergence if instead of an exact solution, | |

the line search algorithm returns a solution which decreases the | |

value of the objective function sufficiently. More precisely, we | |

are looking for a step size s.t. | |

.. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}] | |

This condition is known as the Armijo condition. | |

.. member:: double GradientProblemSolver::Options::max_line_search_step_contraction | |

Default: ``1e-3`` | |

In each iteration of the line search, | |

.. math:: \text{new_step_size} \geq \text{max_line_search_step_contraction} * \text{step_size} | |

Note that by definition, for contraction: | |

.. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1 | |

.. member:: double GradientProblemSolver::Options::min_line_search_step_contraction | |

Default: ``0.6`` | |

In each iteration of the line search, | |

.. math:: \text{new_step_size} \leq \text{min_line_search_step_contraction} * \text{step_size} | |

Note that by definition, for contraction: | |

.. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1 | |

.. member:: int GradientProblemSolver::Options::max_num_line_search_step_size_iterations | |

Default: ``20`` | |

Maximum number of trial step size iterations during each line | |

search, if a step size satisfying the search conditions cannot be | |

found within this number of trials, the line search will stop. | |

As this is an 'artificial' constraint (one imposed by the user, not | |

the underlying math), if ``WOLFE`` line search is being used, *and* | |

points satisfying the Armijo sufficient (function) decrease | |

condition have been found during the current search (in :math:`\leq` | |

``max_num_line_search_step_size_iterations``). Then, the step size | |

with the lowest function value which satisfies the Armijo condition | |

will be returned as the new valid step, even though it does *not* | |

satisfy the strong Wolfe conditions. This behaviour protects | |

against early termination of the optimizer at a sub-optimal point. | |

.. member:: int GradientProblemSolver::Options::max_num_line_search_direction_restarts | |

Default: ``5`` | |

Maximum number of restarts of the line search direction algorithm | |

before terminating the optimization. Restarts of the line search | |

direction algorithm occur when the current algorithm fails to | |

produce a new descent direction. This typically indicates a | |

numerical failure, or a breakdown in the validity of the | |

approximations used. | |

.. member:: double GradientProblemSolver::Options::line_search_sufficient_curvature_decrease | |

Default: ``0.9`` | |

The strong Wolfe conditions consist of the Armijo sufficient | |

decrease condition, and an additional requirement that the | |

step size be chosen s.t. the *magnitude* ('strong' Wolfe | |

conditions) of the gradient along the search direction | |

decreases sufficiently. Precisely, this second condition | |

is that we seek a step size s.t. | |

.. math:: \|f'(\text{step_size})\| \leq \text{sufficient_curvature_decrease} * \|f'(0)\| | |

Where :math:`f()` is the line search objective and :math:`f'()` is the derivative | |

of :math:`f` with respect to the step size: :math:`\frac{d f}{d~\text{step size}}`. | |

.. member:: double GradientProblemSolver::Options::max_line_search_step_expansion | |

Default: ``10.0`` | |

During the bracketing phase of a Wolfe line search, the step size | |

is increased until either a point satisfying the Wolfe conditions | |

is found, or an upper bound for a bracket containing a point | |

satisfying the conditions is found. Precisely, at each iteration | |

of the expansion: | |

.. math:: \text{new_step_size} \leq \text{max_step_expansion} * \text{step_size} | |

By definition for expansion | |

.. math:: \text{max_step_expansion} > 1.0 | |

.. member:: int GradientProblemSolver::Options::max_num_iterations | |

Default: ``50`` | |

Maximum number of iterations for which the solver should run. | |

.. member:: double GradientProblemSolver::Options::max_solver_time_in_seconds | |

Default: ``1e6`` | |

Maximum amount of time for which the solver should run. | |

.. member:: double GradientProblemSolver::Options::function_tolerance | |

Default: ``1e-6`` | |

Solver terminates if | |

.. math:: \frac{|\Delta \text{cost}|}{\text{cost}} \leq \text{function_tolerance} | |

where, :math:`\Delta \text{cost}` is the change in objective | |

function value (up or down) in the current iteration of the line search. | |

.. member:: double GradientProblemSolver::Options::gradient_tolerance | |

Default: ``1e-10`` | |

Solver terminates if | |

.. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty \leq \text{gradient_tolerance} | |

where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi` | |

is projection onto the bounds constraints and :math:`\boxplus` is | |

Plus operation for the manifold associated with the parameter | |

vector. | |

.. member:: double GradientProblemSolver::Options::parameter_tolerance | |

Default: ``1e-8`` | |

Solver terminates if | |

.. math:: \|\Delta x\| \leq (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance} | |

where :math:`\Delta x` is the step computed by the linear solver in | |

the current iteration of the line search. | |

.. member:: LoggingType GradientProblemSolver::Options::logging_type | |

Default: ``PER_MINIMIZER_ITERATION`` | |

.. member:: bool GradientProblemSolver::Options::minimizer_progress_to_stdout | |

Default: ``false`` | |

By default the :class:`Minimizer` progress is logged to ``STDERR`` | |

depending on the ``vlog`` level. If this flag is set to true, and | |

:member:`GradientProblemSolver::Options::logging_type` is not | |

``SILENT``, the logging output is sent to ``STDOUT``. | |

The progress display looks like | |

.. code-block:: bash | |

0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 2.98e-02 tt: 8.50e-02 | |

1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e: 1 it: 4.54e-02 tt: 1.31e-01 | |

2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e: 1 it: 4.96e-02 tt: 1.81e-01 | |

Here | |

#. ``f`` is the value of the objective function. | |

#. ``d`` is the change in the value of the objective function if | |

the step computed in this iteration is accepted. | |

#. ``g`` is the max norm of the gradient. | |

#. ``h`` is the change in the parameter vector. | |

#. ``s`` is the optimal step length computed by the line search. | |

#. ``it`` is the time take by the current iteration. | |

#. ``tt`` is the total time taken by the minimizer. | |

.. member:: vector<IterationCallback> GradientProblemSolver::Options::callbacks | |

Callbacks that are executed at the end of each iteration of the | |

:class:`Minimizer`. They are executed in the order that they are | |

specified in this vector. By default, parameter blocks are updated | |

only at the end of the optimization, i.e., when the | |

:class:`Minimizer` terminates. This behavior is controlled by | |

:member:`GradientProblemSolver::Options::update_state_every_variable`. If | |

the user wishes to have access to the update parameter blocks when | |

his/her callbacks are executed, then set | |

:member:`GradientProblemSolver::Options::update_state_every_iteration` | |

to true. | |

The solver does NOT take ownership of these pointers. | |

.. member:: bool Solver::Options::update_state_every_iteration | |

Default: ``false`` | |

Normally the parameter vector is only updated when the solver | |

terminates. Setting this to true updates it every iteration. This | |

setting is useful when building an interactive application using | |

Ceres and using an :class:`IterationCallback`. | |

:class:`GradientProblemSolver::Summary` | |

--------------------------------------- | |

.. class:: GradientProblemSolver::Summary | |

Summary of the various stages of the solver after termination. | |

.. function:: string GradientProblemSolver::Summary::BriefReport() const | |

A brief one line description of the state of the solver after | |

termination. | |

.. function:: string GradientProblemSolver::Summary::FullReport() const | |

A full multiline description of the state of the solver after | |

termination. | |

.. function:: bool GradientProblemSolver::Summary::IsSolutionUsable() const | |

Whether the solution returned by the optimization algorithm can be | |

relied on to be numerically sane. This will be the case if | |

`GradientProblemSolver::Summary:termination_type` is set to `CONVERGENCE`, | |

`USER_SUCCESS` or `NO_CONVERGENCE`, i.e., either the solver | |

converged by meeting one of the convergence tolerances or because | |

the user indicated that it had converged or it ran to the maximum | |

number of iterations or time. | |

.. member:: TerminationType GradientProblemSolver::Summary::termination_type | |

The cause of the minimizer terminating. | |

.. member:: string GradientProblemSolver::Summary::message | |

Reason why the solver terminated. | |

.. member:: double GradientProblemSolver::Summary::initial_cost | |

Cost of the problem (value of the objective function) before the | |

optimization. | |

.. member:: double GradientProblemSolver::Summary::final_cost | |

Cost of the problem (value of the objective function) after the | |

optimization. | |

.. member:: vector<IterationSummary> GradientProblemSolver::Summary::iterations | |

:class:`IterationSummary` for each minimizer iteration in order. | |

.. member:: int num_cost_evaluations | |

Number of times the cost (and not the gradient) was evaluated. | |

.. member:: int num_gradient_evaluations | |

Number of times the gradient (and the cost) were evaluated. | |

.. member:: double GradientProblemSolver::Summary::total_time_in_seconds | |

Time (in seconds) spent in the solver. | |

.. member:: double GradientProblemSolver::Summary::cost_evaluation_time_in_seconds | |

Time (in seconds) spent evaluating the cost vector. | |

.. member:: double GradientProblemSolver::Summary::gradient_evaluation_time_in_seconds | |

Time (in seconds) spent evaluating the gradient vector. | |

.. member:: int GradientProblemSolver::Summary::num_parameters | |

Number of parameters in the problem. | |

.. member:: int GradientProblemSolver::Summary::num_local_parameters | |

Dimension of the tangent space of the problem. This is different | |

from :member:`GradientProblemSolver::Summary::num_parameters` if a | |

:class:`LocalParameterization`/:class:`Manifold` object is used. | |

.. NOTE:: | |

``num_local_parameters`` is deprecated and will be removed in | |

Ceres Solver version 2.2.0. Please use ``num_tangent_parameters`` | |

instead. | |

.. member:: int GradientProblemSolver::Summary::num_tangent_parameters | |

Dimension of the tangent space of the problem. This is different | |

from :member:`GradientProblemSolver::Summary::num_parameters` if a | |

:class:`LocalParameterization`/:class:`Manifold` object is used. | |

.. member:: LineSearchDirectionType GradientProblemSolver::Summary::line_search_direction_type | |

Type of line search direction used. | |

.. member:: LineSearchType GradientProblemSolver::Summary::line_search_type | |

Type of the line search algorithm used. | |

.. member:: LineSearchInterpolationType GradientProblemSolver::Summary::line_search_interpolation_type | |

When performing line search, the degree of the polynomial used to | |

approximate the objective function. | |

.. member:: NonlinearConjugateGradientType GradientProblemSolver::Summary::nonlinear_conjugate_gradient_type | |

If the line search direction is `NONLINEAR_CONJUGATE_GRADIENT`, | |

then this indicates the particular variant of non-linear conjugate | |

gradient used. | |

.. member:: int GradientProblemSolver::Summary::max_lbfgs_rank | |

If the type of the line search direction is `LBFGS`, then this | |

indicates the rank of the Hessian approximation. |