.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _`chapter-modeling`:

============
Modeling API
============

Recall that Ceres solves robustified non-linear least squares problems
of the form

.. math:: \frac{1}{2}\sum_{i} \rho_i\left(\left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2\right).
   :label: ceresproblem3

The expression
:math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
is known as a ``ResidualBlock``, where :math:`f_i(\cdot)` is a
:class:`CostFunction` that depends on the parameter blocks
:math:`\left[x_{i_1},... , x_{i_k}\right]`. In most optimization
problems small groups of scalars occur together. For example the three
components of a translation vector and the four components of the
quaternion that define the pose of a camera. We refer to such a group
of small scalars as a ``ParameterBlock``. Of course a
``ParameterBlock`` can just be a single parameter. :math:`\rho_i` is a
:class:`LossFunction`. A :class:`LossFunction` is a scalar function
that is used to reduce the influence of outliers on the solution of
non-linear least squares problems.

In this chapter we will describe the various classes that are part of
Ceres Solver's modeling API, and how they can be used to construct an
optimization problem.

Once a problem has been constructed, the various methods for solving
it are discussed in :ref:`chapter-solving`. It is by design that the
modeling and the solving APIs are orthogonal to each other. This
enables easy switching/tweaking of various solver parameters without
having to touch the problem once it has been successfully modeled.
:class:`CostFunction`
---------------------

.. class:: CostFunction

   .. code-block:: c++

     class CostFunction {
      public:
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const = 0;
       const vector<int16>& parameter_block_sizes();
       int num_residuals() const;

      protected:
       vector<int16>* mutable_parameter_block_sizes();
       void set_num_residuals(int num_residuals);
     };

   Given parameter blocks :math:`\left[x_{i_1}, ... , x_{i_k}\right]`,
   a :class:`CostFunction` is responsible for computing a vector of
   residuals and if asked a vector of Jacobian matrices, i.e., given
   :math:`\left[x_{i_1}, ... , x_{i_k}\right]`, compute the vector
   :math:`f_i\left(x_{i_1},...,x_{i_k}\right)` and the matrices

   .. math:: J_{ij} = \frac{\partial}{\partial x_{i_j}}f_i\left(x_{i_1},...,x_{i_k}\right),\quad \forall j \in \{1, \ldots, k\}

   The signature of the :class:`CostFunction` (number and sizes of
   input parameter blocks and number of outputs) is stored in
   :member:`CostFunction::parameter_block_sizes_` and
   :member:`CostFunction::num_residuals_` respectively. User code
   inheriting from this class is expected to set these two members
   with the corresponding accessors. This information will be verified
   by the :class:`Problem` when added with
   :func:`Problem::AddResidualBlock`.

.. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians) const

   This is the key method. It implements the residual and Jacobian
   computation.

   ``parameters`` is an array of pointers to arrays containing the
   various parameter blocks. ``parameters`` has the same number of
   elements as :member:`CostFunction::parameter_block_sizes_`.
   Parameter blocks are in the same order as
   :member:`CostFunction::parameter_block_sizes_`.

   ``residuals`` is an array of size ``num_residuals_``.

   ``jacobians`` is an array of size ``parameter_block_sizes_.size()``
   containing pointers to storage for Jacobian matrices corresponding
   to each parameter block. The Jacobian matrices are in the same
   order as :member:`CostFunction::parameter_block_sizes_`.
   ``jacobians[i]`` is an array that contains
   :member:`CostFunction::num_residuals_` x
   :member:`CostFunction::parameter_block_sizes_` ``[i]``
   elements. Each Jacobian matrix is stored in row-major order, i.e.,
   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\partial residual[r]}{\partial parameters[i][c]}`

   If ``jacobians`` is ``NULL``, then no derivatives are returned;
   this is the case when computing cost only. If ``jacobians[i]`` is
   ``NULL``, then the Jacobian matrix corresponding to the
   :math:`i^{\textrm{th}}` parameter block must not be returned; this
   is the case when the parameter block is marked constant.
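
   As an illustration, here is a minimal sketch of a hand-written cost
   function; the class name and the residual :math:`r = 1 - \|x\|^2`
   are made up for this example. It has a single 2-dimensional
   parameter block and one residual, and shows the expected use of the
   size accessors and the ``NULL`` checks on ``jacobians``:

   .. code-block:: c++

     class MyCostFunction : public CostFunction {
      public:
       MyCostFunction() {
         // One parameter block of size 2, one residual.
         mutable_parameter_block_sizes()->push_back(2);
         set_num_residuals(1);
       }

       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const {
         const double* x = parameters[0];
         residuals[0] = 1.0 - x[0] * x[0] - x[1] * x[1];

         // Jacobians are only required if requested, and only for
         // parameter blocks that are not constant.
         if (jacobians != NULL && jacobians[0] != NULL) {
           jacobians[0][0] = -2.0 * x[0];  // d r / d x[0]
           jacobians[0][1] = -2.0 * x[1];  // d r / d x[1]
         }
         return true;
       }
     };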

:class:`SizedCostFunction`
--------------------------

.. class:: SizedCostFunction

   If the sizes of the parameter blocks and the size of the residual
   vector are known at compile time (this is the common case), Ceres
   provides :class:`SizedCostFunction`, where these values can be
   specified as template parameters. In this case the user only needs
   to implement :func:`CostFunction::Evaluate`.

   .. code-block:: c++

     template<int kNumResiduals,
              int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
              int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
     class SizedCostFunction : public CostFunction {
      public:
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const = 0;
     };
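
   For example, a minimal sketch with analytic derivatives for the
   residual :math:`f(x) = 10 - x`, with one residual and one
   1-dimensional parameter block (the class name is illustrative):

   .. code-block:: c++

     class QuadraticCostFunction : public SizedCostFunction<1, 1> {
      public:
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const {
         double x = parameters[0][0];
         residuals[0] = 10.0 - x;
         if (jacobians != NULL && jacobians[0] != NULL) {
           jacobians[0][0] = -1.0;  // d f / d x
         }
         return true;
       }
     };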

:class:`AutoDiffCostFunction`
-----------------------------

.. class:: AutoDiffCostFunction

   But even defining the :class:`SizedCostFunction` can be a tedious
   affair if complicated derivative computations are involved. To this
   end Ceres provides automatic differentiation.

   To get an auto differentiated cost function, you must define a
   class with a templated ``operator()`` (a functor) that computes the
   cost function in terms of the template parameter ``T``. The
   autodiff framework substitutes appropriate ``Jet`` objects for
   ``T`` in order to compute the derivative when necessary, but this
   is hidden, and you should write the function as if ``T`` were a
   scalar type (e.g. a double-precision floating point number).

   The function must write the computed value in the last argument
   (the only non-``const`` one) and return true to indicate success.

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional vector
   parameters and :math:`k` is a constant. The form of this error,
   which is the difference between a constant and an expression, is a
   common pattern in least squares problems. For example, the value
   :math:`x^\top y` might be the model expectation for a series of
   measurements, where there is an instance of the cost function for
   each measurement :math:`k`.

   The actual cost added to the total problem is :math:`e^2`, or
   :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
   by the optimization framework.

   To write an auto-differentiable cost function for the above model,
   first define the object

   .. code-block:: c++

     class MyScalarCostFunctor {
      public:
       MyScalarCostFunctor(double k) : k_(k) {}

       template <typename T>
       bool operator()(const T* const x, const T* const y, T* e) const {
         e[0] = T(k_) - x[0] * y[0] - x[1] * y[1];
         return true;
       }

      private:
       double k_;
     };

   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to arrays
   of ``T``. If there were three input parameters, then the third input
   parameter would come after ``y``. The output is always the last
   parameter, and is also a pointer to an array. In the example above,
   ``e`` is a scalar, so only ``e[0]`` is set.

   Then given this class definition, the auto differentiated cost
   function for it can be constructed as follows.

   .. code-block:: c++

     CostFunction* cost_function
         = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
             new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                         |  |  |
                             Dimension of residual ------+  |  |
                             Dimension of x ----------------+  |
                             Dimension of y -------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   **WARNING 1** Since the functor will get instantiated with
   different types for ``T``, you must convert from other numeric
   types to ``T`` before mixing computations with other variables
   of type ``T``. In the example above, this is seen where instead of
   using ``k_`` directly, ``k_`` is wrapped with ``T(k_)``.

   **WARNING 2** A common beginner's error when first using
   :class:`AutoDiffCostFunction` is to get the sizing wrong. In particular,
   there is a tendency to set the template parameters to (dimension of
   residual, number of parameters) instead of passing a dimension
   parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the 2
   as the last template argument.

:class:`NumericDiffCostFunction`
--------------------------------

.. class:: NumericDiffCostFunction

   .. code-block:: c++

     template <typename CostFunctionNoJacobian,
               NumericDiffMethod method = CENTRAL, int M = 0,
               int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
               int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
     class NumericDiffCostFunction
         : public SizedCostFunction<M, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
     };

   Creates a :class:`CostFunction` as needed by the least squares
   framework with Jacobians computed via numeric (a.k.a. finite)
   differentiation. For more details see
   http://en.wikipedia.org/wiki/Numerical_differentiation.

   To get a numerically differentiated :class:`CostFunction`, you
   must define a class with a ``operator()`` (a functor) that computes
   the residuals. The functor must write the computed value in the
   last argument (the only non-``const`` one) and return ``true`` to
   indicate success, e.g., an object of the form

   .. code-block:: c++

     struct ScalarFunctor {
      public:
       bool operator()(const double* const x1,
                       const double* const x2,
                       double* residuals) const;
     };

   For example, consider a scalar error :math:`e = k - x'y`, where
   both :math:`x` and :math:`y` are two-dimensional column vector
   parameters, the prime sign indicates transposition, and :math:`k`
   is a constant. The form of this error, which is the difference
   between a constant and an expression, is a common pattern in least
   squares problems. For example, the value :math:`x'y` might be the
   model expectation for a series of measurements, where there is an
   instance of the cost function for each measurement :math:`k`.

   To write a numerically differentiated :class:`CostFunction` for the
   above model, first define the object

   .. code-block:: c++

     class MyScalarCostFunctor {
      public:
       MyScalarCostFunctor(double k) : k_(k) {}

       bool operator()(const double* const x,
                       const double* const y,
                       double* residuals) const {
         residuals[0] = k_ - x[0] * y[0] - x[1] * y[1];
         return true;
       }

      private:
       double k_;
     };

   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``double`` s. If there were three input parameters, then
   the third input parameter would come after ``y``. The output is
   always the last parameter, and is also a pointer to an array. In
   the example above, the residual is a scalar, so only
   ``residuals[0]`` is set.

   Then given this class definition, the numerically differentiated
   :class:`CostFunction` with central differences used for computing
   the derivative can be constructed as follows.

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
             new MyScalarCostFunctor(1.0));                    ^        ^  ^  ^
                                                               |        |  |  |
                             Finite Differencing Scheme -------+        |  |  |
                             Dimension of residual ---------------------+  |  |
                             Dimension of x --------------------------------+  |
                             Dimension of y -----------------------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   The ``CENTRAL`` difference method is considerably more accurate, at
   the cost of twice as many function evaluations as forward
   differences. Consider using central differences to begin with, and
   only after that works, try forward differences to improve
   performance.

   **WARNING** A common beginner's error when first using
   :class:`NumericDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the last ``2`` argument. Please be careful when setting the
   size parameters.

   **Alternate Interface**

   For a variety of reasons, including compatibility with legacy code,
   :class:`NumericDiffCostFunction` can also take
   :class:`CostFunction` objects as input. The following describes
   how.

   To get a numerically differentiated cost function, define a
   subclass of :class:`CostFunction` such that the
   :func:`CostFunction::Evaluate` function ignores the ``jacobians``
   parameter. The numeric differentiation wrapper will fill in the
   ``jacobians`` parameter if necessary by repeatedly calling
   :func:`CostFunction::Evaluate` with small changes to the
   appropriate parameters, and computing the slope. For performance,
   the numeric differentiation wrapper class is templated on the
   concrete cost function, even though it could be implemented only in
   terms of the :class:`CostFunction` interface.

   The numerically differentiated version of a cost function can be
   constructed as follows:

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
             new MyCostFunction(...), TAKE_OWNERSHIP);

   where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
   sizes 4 and 8 respectively. Look at the tests for a more detailed
   example.

:class:`NormalPrior`
--------------------

.. class:: NormalPrior

   .. code-block:: c++

     class NormalPrior: public CostFunction {
      public:
       // Check that the number of rows in the vector b is the same as
       // the number of columns in the matrix A; crash otherwise.
       NormalPrior(const Matrix& A, const Vector& b);

       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const;
     };

   Implements a cost function of the form

   .. math:: cost(x) = \|A(x - b)\|^2

   where the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form

   .. math:: cost(x) = (x - \mu)^T S^{-1} (x - \mu)

   where :math:`\mu` is a vector and :math:`S` is a covariance matrix,
   then :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is the square
   root of the inverse of the covariance, also known as the stiffness
   matrix. There are however no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.
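
   For example, a short sketch of adding a prior on a 3-dimensional
   parameter block ``x``; here ``problem`` and ``x`` are assumed to
   exist, and ``A`` is assumed to have been computed elsewhere as the
   stiffness matrix of the prior:

   .. code-block:: c++

     Matrix A(3, 3);  // ceres::Matrix is a dense double matrix.
     Vector mu(3);    // ceres::Vector is a dense double vector.
     // ... fill A with the stiffness matrix and mu with the mean ...
     problem.AddResidualBlock(new NormalPrior(A, mu), NULL, x);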

:class:`ConditionedCostFunction`
--------------------------------

.. class:: ConditionedCostFunction

   This class allows you to apply different conditioning to the residual
   values of a wrapped cost function. An example where this is useful is
   where you have an existing cost function that produces N values, but you
   want the total cost to be something other than just the sum of these
   squared values - maybe you want to apply a different scaling to some
   values, to change their contribution to the cost.

   Usage:

   .. code-block:: c++

     // my_cost_function produces N residuals
     CostFunction* my_cost_function = ...
     CHECK_EQ(N, my_cost_function->num_residuals());
     vector<CostFunction*> conditioners;

     // Make N 1x1 cost functions (1 parameter, 1 residual)
     CostFunction* f_1 = ...
     conditioners.push_back(f_1);

     CostFunction* f_N = ...
     conditioners.push_back(f_N);
     ConditionedCostFunction* ccf =
         new ConditionedCostFunction(my_cost_function, conditioners);

   Now ``ccf``'s ``residual[i]`` (i=0..N-1) will be passed through the
   :math:`i^{\text{th}}` conditioner.

   .. code-block:: c++

     ccf_residual[i] = f_i(my_cost_function_residual[i])

   and the Jacobian will be affected appropriately.
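
   A conditioner is just a :class:`CostFunction` with one
   1-dimensional parameter block and one residual. For example, a
   minimal sketch of an affine conditioner :math:`f(r) = a r + b`; the
   class name is made up for this example:

   .. code-block:: c++

     class AffineConditioner : public SizedCostFunction<1, 1> {
      public:
       AffineConditioner(double a, double b) : a_(a), b_(b) {}

       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const {
         residuals[0] = a_ * parameters[0][0] + b_;
         if (jacobians != NULL && jacobians[0] != NULL) {
           jacobians[0][0] = a_;
         }
         return true;
       }

      private:
       double a_, b_;
     };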

:class:`CostFunctionToFunctor`
------------------------------

.. class:: CostFunctionToFunctor

   :class:`CostFunctionToFunctor` is an adapter class that allows users to use
   :class:`CostFunction` objects in templated functors which are to be used for
   automatic differentiation. This allows the user to seamlessly mix
   analytic, numeric and automatic differentiation.

   For example, let us assume that

   .. code-block:: c++

     class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
      public:
       IntrinsicProjection(const double* observations);
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const;
     };

   is a :class:`CostFunction` that implements the projection of a
   point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute its
   residuals, and its Jacobians either via analytic or numeric
   differentiation.

   Now we would like to compose the action of this
   :class:`CostFunction` with the action of camera extrinsics, i.e.,
   rotation and translation. Say we have a templated function

   .. code-block:: c++

     template<typename T>
     void RotateAndTranslatePoint(const T* rotation,
                                  const T* translation,
                                  const T* point,
                                  T* result);

   Then we can now do the following,

   .. code-block:: c++

     struct CameraProjection {
       CameraProjection(double* observation) {
         intrinsic_projection_.reset(
             new CostFunctionToFunctor<2, 5, 3>(
                 new IntrinsicProjection(observation)));
       }

       template <typename T>
       bool operator()(const T* rotation,
                       const T* translation,
                       const T* intrinsics,
                       const T* point,
                       T* residual) const {
         T transformed_point[3];
         RotateAndTranslatePoint(rotation, translation, point, transformed_point);

         // Note that we call intrinsic_projection_, just like it was
         // any other templated functor.
         return (*intrinsic_projection_)(intrinsics, transformed_point, residual);
       }

      private:
       scoped_ptr<CostFunctionToFunctor<2, 5, 3> > intrinsic_projection_;
     };

:class:`NumericDiffFunctor`
---------------------------

.. class:: NumericDiffFunctor

   A wrapper class that takes a variadic functor evaluating a
   function, numerically differentiates it and makes it available as a
   templated functor so that it can be easily used as part of Ceres'
   automatic differentiation framework.

   For example, let us assume that

   .. code-block:: c++

     struct IntrinsicProjection {
       IntrinsicProjection(const double* observations);
       bool operator()(const double* calibration,
                       const double* point,
                       double* residuals);
     };

   is a functor that implements the projection of a point in its local
   coordinate system onto its image plane and subtracts it from the
   observed point projection.

   Now we would like to compose the action of this functor with the
   action of camera extrinsics, i.e., rotation and translation, which
   is given by the following templated function

   .. code-block:: c++

     template<typename T>
     void RotateAndTranslatePoint(const T* rotation,
                                  const T* translation,
                                  const T* point,
                                  T* result);

   To compose the extrinsics and intrinsics, we can construct a
   ``CameraProjection`` functor as follows.

   .. code-block:: c++

     struct CameraProjection {
       typedef NumericDiffFunctor<IntrinsicProjection, CENTRAL, 2, 5, 3>
           IntrinsicProjectionFunctor;

       CameraProjection(double* observation) {
         intrinsic_projection_.reset(
             new IntrinsicProjectionFunctor(observation));
       }

       template <typename T>
       bool operator()(const T* rotation,
                       const T* translation,
                       const T* intrinsics,
                       const T* point,
                       T* residuals) const {
         T transformed_point[3];
         RotateAndTranslatePoint(rotation, translation, point, transformed_point);
         return (*intrinsic_projection_)(intrinsics, transformed_point, residuals);
       }

      private:
       scoped_ptr<IntrinsicProjectionFunctor> intrinsic_projection_;
     };

   Here, we made the choice of using ``CENTRAL`` differences to compute
   the Jacobian of ``IntrinsicProjection``.

   Now, we are ready to construct an automatically differentiated cost
   function as

   .. code-block:: c++

     CostFunction* cost_function =
         new AutoDiffCostFunction<CameraProjection, 2, 3, 3, 5, 3>(
             new CameraProjection(observations));

   ``cost_function`` now seamlessly integrates automatic
   differentiation of ``RotateAndTranslatePoint`` with a numerically
   differentiated version of ``IntrinsicProjection``.

:class:`LossFunction`
---------------------

.. class:: LossFunction

   For least squares problems where the minimization may encounter
   input terms that contain outliers, that is, completely bogus
   measurements, it is important to use a loss function that reduces
   their influence.

   Consider a structure from motion problem. The unknowns are 3D
   points and camera parameters, and the measurements are image
   coordinates describing the expected reprojected position for a
   point in a camera. For example, we want to model the geometry of a
   street scene with fire hydrants and cars, observed by a moving
   camera with unknown parameters, and the only 3D points we care
   about are the pointy tippy-tops of the fire hydrants. Our magic
   image processing algorithm, which is responsible for producing the
   measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we didn't do
   anything special the residual for the erroneous measurement will
   result in the entire solution getting pulled away from the optimum
   to reduce the large error that would otherwise be attributed to the
   wrong measurement.

   Using a robust loss function, the cost for large residuals is
   reduced. In the example above, this leads to outlier terms getting
   down-weighted so they do not overly influence the final solution.

   .. code-block:: c++

     class LossFunction {
      public:
       virtual void Evaluate(double s, double out[3]) const = 0;
     };

   The key method is :func:`LossFunction::Evaluate`, which given a
   non-negative scalar ``s``, computes

   .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}

   Here the convention is that the contribution of a term to the cost
   function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s
   =\|f_i\|^2`. Calling the method with a negative value of :math:`s`
   is an error and the implementations are not required to handle that
   case.

   Most sane choices of :math:`\rho` satisfy:

   .. math::

      \rho(0) &= 0\\
      \rho'(0) &= 1\\
      \rho'(s) &< 1 \text{ in the outlier region}\\
      \rho''(s) &< 0 \text{ in the outlier region}

   so that they mimic the squared cost for small residuals.
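
   As an illustration of this interface, here is a sketch of how the
   (unscaled) Huber loss described below could be implemented; Ceres
   already provides :class:`HuberLoss`, so this is purely to show the
   contract on ``out``:

   .. code-block:: c++

     class MyHuberLoss : public LossFunction {
      public:
       virtual void Evaluate(double s, double out[3]) const {
         if (s <= 1.0) {
           out[0] = s;    // rho(s)
           out[1] = 1.0;  // rho'(s)
           out[2] = 0.0;  // rho''(s)
         } else {
           const double r = std::sqrt(s);
           out[0] = 2.0 * r - 1.0;
           out[1] = 1.0 / r;
           out[2] = -0.5 / (s * r);
         }
       }
     };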

   **Scaling**

   Given one robustifier :math:`\rho(s)` one can change the length
   scale at which robustification takes place, by adding a scale
   factor :math:`a > 0` which gives us :math:`\rho(s,a) = a^2 \rho(s /
   a^2)` and the first and second derivatives as :math:`\rho'(s /
   a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.

   The reason for the appearance of squaring is that :math:`a` is in
   the units of the residual vector norm whereas :math:`s` is a squared
   norm. For applications it is more convenient to specify :math:`a` than
   its square.

Instances
^^^^^^^^^

Ceres includes a number of predefined loss functions. For simplicity
we described their unscaled versions. The figure below illustrates
their shape graphically. More details can be found in
``include/ceres/loss_function.h``.

.. figure:: loss.png
   :figwidth: 500px
   :height: 400px
   :align: center

   Shape of the various common loss functions.

.. class:: TrivialLoss

   .. math:: \rho(s) = s

.. class:: HuberLoss

   .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}

.. class:: SoftLOneLoss

   .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)

.. class:: CauchyLoss

   .. math:: \rho(s) = \log(1 + s)

.. class:: ArctanLoss

   .. math:: \rho(s) = \arctan(s)

.. class:: TolerantLoss

   .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})

.. class:: ComposedLoss

   Given two loss functions :math:`f` and :math:`g`, implements the
   composed loss function :math:`h(s) = f(g(s))`.

.. class:: ScaledLoss

   Given a loss function :math:`\rho(s)` and a scalar :math:`a`,
   implements the loss function :math:`a \rho(s)`.

.. class:: LossFunctionWrapper

   Wraps another loss function and allows it to be replaced after the
   optimization problem has been constructed.
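
For example, a sketch of typical usage, where ``problem``,
``cost_function`` and ``x`` are assumed to exist; the constructor
argument of the loss functions is the scale :math:`a` described above:

.. code-block:: c++

  problem.AddResidualBlock(cost_function, new HuberLoss(1.0), x);

  // LossFunctionWrapper lets the loss be swapped out after the
  // problem has been constructed.
  LossFunctionWrapper* loss = new LossFunctionWrapper(new HuberLoss(1.0),
                                                      TAKE_OWNERSHIP);
  problem.AddResidualBlock(cost_function, loss, x);
  loss->Reset(new CauchyLoss(0.5), TAKE_OWNERSHIP);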

Theory
^^^^^^

Let us consider a problem with a single residual block and a single
parameter block.

.. math::

   \min_x \frac{1}{2}\rho(f^2(x))


Then, the robustified gradient and the Gauss-Newton Hessian are

.. math::

   g(x) &= \rho'J^\top(x)f(x)\\
   H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)

where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the corresponding linear least squares
problem yields the robustified Gauss-Newton step.


Let :math:`\alpha` be a root of

.. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.
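
Solving this quadratic and choosing the root with :math:`\alpha \le 1`
gives

.. math:: \alpha = 1 - \sqrt{1 + 2 \frac{\rho''}{\rho'}\|f(x)\|^2}.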

Then, define the rescaled residual and Jacobian as

.. math::

   \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
   \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
   \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)


In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
we limit :math:`\alpha \le 1- \epsilon` for some small
:math:`\epsilon`. For more details see [Triggs]_.

With this simple rescaling, one can use any Jacobian based non-linear
least squares algorithm to solve robustified non-linear least squares
problems.


:class:`LocalParameterization`
------------------------------

.. class:: LocalParameterization

   .. code-block:: c++

     class LocalParameterization {
      public:
       virtual ~LocalParameterization() {}
       virtual bool Plus(const double* x,
                         const double* delta,
                         double* x_plus_delta) const = 0;
       virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
       virtual int GlobalSize() const = 0;
       virtual int LocalSize() const = 0;
     };

   Sometimes the parameters :math:`x` can overparameterize a
   problem. In that case it is desirable to choose a parameterization
   to remove the null directions of the cost. More generally, if
   :math:`x` lies on a manifold of a smaller dimension than the
   ambient space that it is embedded in, then it is numerically and
   computationally more effective to optimize it using a
   parameterization that lives in the tangent space of that manifold
   at each point.

   For example, a sphere in three dimensions is a two dimensional
   manifold, embedded in a three dimensional space. At each point on
   the sphere, the plane tangent to it defines a two dimensional
   tangent space. For a cost function defined on this sphere, given a
   point :math:`x`, moving in the direction normal to the sphere at
   that point is not useful. Thus a better way to parameterize a point
   on a sphere is to optimize over a two dimensional vector
   :math:`\Delta x` in the tangent space at that point, and then
   "move" to the point :math:`x + \Delta x`, where the move operation
   involves projecting back onto the sphere. Doing so removes a
   redundant dimension from the optimization, making it numerically
   more robust and efficient.

   More generally we can define a function

   .. math:: x' = \boxplus(x, \Delta x),

   where :math:`x'` has the same size as :math:`x`, and :math:`\Delta
   x` is of size less than or equal to :math:`x`. The function
   :math:`\boxplus` generalizes the definition of vector
   addition. Thus it satisfies the identity

   .. math:: \boxplus(x, 0) = x,\quad \forall x.

   Instances of :class:`LocalParameterization` implement the
   :math:`\boxplus` operation and its derivative with respect to
   :math:`\Delta x` at :math:`\Delta x = 0`.


.. function:: int LocalParameterization::GlobalSize()

   The dimension of the ambient space in which the parameter block
   :math:`x` lives.

.. function:: int LocalParameterization::LocalSize()

   The size of the tangent space that :math:`\Delta x` lives in.

.. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const

   :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta x)`.

.. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const

   Computes the Jacobian matrix

   .. math:: J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}

   in row major form.

Instances
^^^^^^^^^

.. class:: IdentityParameterization

   A trivial version of :math:`\boxplus` is when :math:`\Delta x` is
   of the same size as :math:`x` and

   .. math:: \boxplus(x, \Delta x) = x + \Delta x

.. class:: SubsetParameterization

   A more interesting case is when :math:`x` is a two dimensional
   vector, and the user wishes to hold the first coordinate
   constant. Then, :math:`\Delta x` is a scalar and :math:`\boxplus`
   is defined as

   .. math::

      \boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
      \end{array} \right] \Delta x

   :class:`SubsetParameterization` generalizes this construction to
   hold any part of a parameter block constant. A sketch implementing
   this two dimensional example directly is shown below.
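
   .. code-block:: c++

     // A sketch implementing the two dimensional example above as a
     // LocalParameterization; in practice one would simply use
     // SubsetParameterization, this is only illustrative.
     class HoldFirstCoordinateConstant : public LocalParameterization {
      public:
       virtual ~HoldFirstCoordinateConstant() {}
       virtual bool Plus(const double* x,
                         const double* delta,
                         double* x_plus_delta) const {
         x_plus_delta[0] = x[0];
         x_plus_delta[1] = x[1] + delta[0];
         return true;
       }
       // The 2 x 1 row-major Jacobian of Plus with respect to delta,
       // evaluated at delta = 0.
       virtual bool ComputeJacobian(const double* x, double* jacobian) const {
         jacobian[0] = 0.0;
         jacobian[1] = 1.0;
         return true;
       }
       virtual int GlobalSize() const { return 2; }
       virtual int LocalSize() const { return 1; }
     };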

.. class:: QuaternionParameterization

   Another example that occurs commonly in Structure from Motion
   problems is when camera rotations are parameterized using a
   quaternion. There, it is useful only to make updates orthogonal to
   the 4-vector defining the quaternion. One way to do this is to let
   :math:`\Delta x` be a 3 dimensional vector and define
   :math:`\boxplus` to be

   .. math:: \boxplus(x, \Delta x) = \left[ \cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x \right] * x
      :label: quaternion

   The multiplication between the two 4-vectors on the right hand side
   is the standard quaternion
   product. :class:`QuaternionParameterization` is an implementation
   of :eq:`quaternion`.
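
   For example, a short sketch of using it; here ``q`` is assumed to
   store a unit quaternion with the scalar part first:

   .. code-block:: c++

     double q[4] = {1.0, 0.0, 0.0, 0.0};
     Problem problem;
     problem.AddParameterBlock(q, 4, new QuaternionParameterization);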


:class:`Problem`
----------------

.. class:: Problem

   :class:`Problem` holds the robustified non-linear least squares
   problem :eq:`ceresproblem3`. To create a least squares problem, use
   the :func:`Problem::AddResidualBlock` and
   :func:`Problem::AddParameterBlock` methods.

   For example a problem containing 3 parameter blocks of sizes 3, 4
   and 5 respectively and two residual blocks of size 2 and 6:

   .. code-block:: c++

     double x1[] = { 1.0, 2.0, 3.0 };
     double x2[] = { 1.0, 2.0, 3.0, 5.0 };
     double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };

     Problem problem;
     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x3);

   :func:`Problem::AddResidualBlock`, as the name implies, adds a
   residual block to the problem. It adds a :class:`CostFunction`, an
   optional :class:`LossFunction` (``NULL`` above) and connects the
   :class:`CostFunction` to a set of parameter blocks.

   The cost function carries with it information about the sizes of
   the parameter blocks it expects. The function checks that these
   match the sizes of the parameter blocks listed in
   ``parameter_blocks``. The program aborts if a mismatch is
   detected. ``loss_function`` can be ``NULL``, in which case the cost
   of the term is just the squared norm of the residuals.

   The user has the option of explicitly adding the parameter blocks
   using :func:`Problem::AddParameterBlock`. This causes additional correctness
   checking; however, :func:`Problem::AddResidualBlock` implicitly adds the
   parameter blocks if they are not present, so calling
   :func:`Problem::AddParameterBlock` explicitly is not required.

   :func:`Problem::AddParameterBlock` explicitly adds a parameter
   block to the :class:`Problem`. Optionally it allows the user to
   associate a :class:`LocalParameterization` object with the parameter
   block too. Repeated calls with the same arguments are
   ignored. Repeated calls with the same double pointer but a
   different size result in undefined behaviour.

   You can set any parameter block to be constant using
   :func:`Problem::SetParameterBlockConstant` and undo this using
   :func:`Problem::SetParameterBlockVariable`.

   In fact you can set any number of parameter blocks to be constant,
   and Ceres is smart enough to figure out what part of the problem
   you have constructed depends on the parameter blocks that are free
   to change, and only spends time solving it. So for example, if you
   constructed a problem with a million parameter blocks and 2 million
   residual blocks, but then set all but one parameter block to be
   constant, such that only 10 residual blocks depend on this one
   non-constant parameter block, then the computational effort Ceres
   spends in solving this problem will be the same as if you had
   defined a problem with one parameter block and 10 residual blocks.
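
   Continuing the example above, a sketch of marking ``x2`` constant
   and later freeing it again:

   .. code-block:: c++

     problem.SetParameterBlockConstant(x2);
     // ... solve, inspect the solution, etc. ...
     problem.SetParameterBlockVariable(x2);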

   **Ownership**

   :class:`Problem` by default takes ownership of the
   ``cost_function``, ``loss_function`` and ``local_parameterization``
   pointers. These objects remain live for the life of the
   :class:`Problem`. If the user wishes to keep control over the
   destruction of these objects, then they can do this by setting the
   corresponding enums in the :class:`Problem::Options` struct.

   Even though :class:`Problem` takes ownership of these pointers, it
   does not preclude the user from re-using them in another residual
   or parameter block. The destructor takes care to call delete on
   each pointer only once.


.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*> parameter_blocks)


.. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization)
              void Problem::AddParameterBlock(double* values, int size)


.. function:: void Problem::SetParameterBlockConstant(double* values)

.. function:: void Problem::SetParameterBlockVariable(double* values)


.. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization)


.. function:: int Problem::NumParameterBlocks() const


.. function:: int Problem::NumParameters() const


.. function:: int Problem::NumResidualBlocks() const


.. function:: int Problem::NumResiduals() const



``rotation.h``
--------------

Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
quaternion and matrix) we provide a handy set of templated
functions. These functions are templated so that the user can use them
within Ceres Solver's automatic differentiation framework.
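
For example, a sketch of a pinhole reprojection functor that uses
``AngleAxisRotatePoint`` inside Ceres' automatic differentiation
framework; the functor, its parameter layout and the member names are
illustrative assumptions, not a fixed Ceres API:

.. code-block:: c++

  struct ReprojectionError {
    ReprojectionError(double observed_x, double observed_y)
        : observed_x_(observed_x), observed_y_(observed_y) {}

    template <typename T>
    bool operator()(const T* const angle_axis,   // 3-vector rotation
                    const T* const translation,  // 3-vector translation
                    const T* const point,        // 3-vector point
                    T* residuals) const {
      T p[3];
      AngleAxisRotatePoint(angle_axis, point, p);
      p[0] += translation[0];
      p[1] += translation[1];
      p[2] += translation[2];
      // Normalized pinhole projection.
      residuals[0] = p[0] / p[2] - T(observed_x_);
      residuals[1] = p[1] / p[2] - T(observed_y_);
      return true;
    }

    double observed_x_;
    double observed_y_;
  };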

.. function:: void AngleAxisToQuaternion<T>(T const* angle_axis, T* quaternion)

   Convert a value in combined axis-angle representation to a
   quaternion.

   The value ``angle_axis`` is a triple whose norm is an angle in radians,
   and whose direction is aligned with the axis of rotation, and
   ``quaternion`` is a 4-tuple that will contain the resulting quaternion.

.. function:: void QuaternionToAngleAxis<T>(T const* quaternion, T* angle_axis)

   Convert a quaternion to the equivalent combined axis-angle
   representation.

   The value ``quaternion`` must be a unit quaternion - it is not
   normalized first, and ``angle_axis`` will be filled with a value
   whose norm is the angle of rotation in radians, and whose direction
   is the axis of rotation.

.. function:: void RotationMatrixToAngleAxis<T>(T const * R, T * angle_axis)
.. function:: void AngleAxisToRotationMatrix<T>(T const * angle_axis, T * R)

   Conversions between 3x3 rotation matrix (in column major order) and
   axis-angle rotation representations.

.. function:: void EulerAnglesToRotationMatrix<T>(const T* euler, int row_stride, T* R)

   Conversions between 3x3 rotation matrix (in row major order) and
   Euler angle (in degrees) rotation representations.

   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
   axes, respectively. They are applied in that same order, so the
   total rotation R is Rz * Ry * Rx.

.. function:: void QuaternionToScaledRotation<T>(const T q[4], T R[3 * 3])

   Convert a 4-vector to a 3x3 scaled rotation matrix.

   The choice of rotation is such that the quaternion
   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
   matrix and for small :math:`a, b, c` the quaternion
   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix

   .. math::

      I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a\\ -b & a & 0
      \end{bmatrix} + O(q^2)

   which corresponds to a Rodrigues approximation, the last matrix
   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
   c\end{bmatrix}`. Together with the property that :math:`R(q_1 * q_2)
   = R(q_1) * R(q_2)` this uniquely defines the mapping from :math:`q` to
   :math:`R`.

   The rotation matrix ``R`` is row-major.

   No normalization of the quaternion is performed, i.e.
   :math:`R = \|q\|^2 Q`, where :math:`Q` is an orthonormal matrix
   such that :math:`\det(Q) = 1` and :math:`Q Q' = I`.


.. function:: void QuaternionToRotation<T>(const T q[4], T R[3 * 3])

   Same as above except that the rotation matrix is normalized by the
   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).

.. function:: void UnitQuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   Rotates a point ``pt`` by a quaternion ``q``:

   .. math:: \text{result} = R(q)\ \text{pt}

   Assumes the quaternion is unit norm. If you pass in a quaternion
   with :math:`|q|^2 = 2` then you WILL NOT get back 2 times the
   result you get for a unit quaternion.


.. function:: void QuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   With this function you do not need to assume that ``q`` has unit norm.
   It does assume that the norm is non-zero.

.. function:: void QuaternionProduct<T>(const T z[4], const T w[4], T zw[4])

   .. math:: zw = z * w

   where :math:`*` is the quaternion product between 4-vectors.


.. function:: void CrossProduct<T>(const T x[3], const T y[3], T x_cross_y[3])

   .. math:: \text{x\_cross\_y} = x \times y

.. function:: void AngleAxisRotatePoint<T>(const T angle_axis[3], const T pt[3], T result[3])

   .. math:: y = R(\text{angle\_axis}) x