.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _`chapter-modeling`:

============
Modeling API
============

Recall that Ceres solves robustified non-linear least squares problems
of the form

.. math:: \frac{1}{2}\sum_{i} \rho_i\left(\left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2\right).
   :label: ceresproblem3

The expression
:math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
is known as a ``ResidualBlock``, where :math:`f_i(\cdot)` is a
:class:`CostFunction` that depends on the parameter blocks
:math:`\left[x_{i_1},... , x_{i_k}\right]`. In most optimization
problems small groups of scalars occur together. For example the three
components of a translation vector and the four components of the
quaternion that define the pose of a camera. We refer to such a group
of small scalars as a ``ParameterBlock``. Of course a
``ParameterBlock`` can just be a single parameter. :math:`\rho_i` is a
:class:`LossFunction`. A :class:`LossFunction` is a scalar function
that is used to reduce the influence of outliers on the solution of
non-linear least squares problems.

In this chapter we will describe the various classes that are part of
Ceres Solver's modeling API, and how they can be used to construct an
optimization problem.

Once a problem has been constructed, the various methods for solving
it are discussed in :ref:`chapter-solving`. It is by design that the
modeling and the solving APIs are orthogonal to each other. This
enables easy switching/tweaking of various solver parameters without
having to touch the problem once it has been successfully modeled.
:class:`CostFunction`
---------------------

.. class:: CostFunction

   .. code-block:: c++

     class CostFunction {
      public:
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const = 0;
       const vector<int16>& parameter_block_sizes();
       int num_residuals() const;

      protected:
       vector<int16>* mutable_parameter_block_sizes();
       void set_num_residuals(int num_residuals);
     };

   Given parameter blocks :math:`\left[x_{i_1}, ... , x_{i_k}\right]`,
   a :class:`CostFunction` is responsible for computing a vector of
   residuals and if asked a vector of Jacobian matrices, i.e., given
   :math:`\left[x_{i_1}, ... , x_{i_k}\right]`, compute the vector
   :math:`f_i\left(x_{i_1},...,x_{i_k}\right)` and the matrices

   .. math:: J_{ij} = \frac{\partial}{\partial x_{i_j}}f_i\left(x_{i_1},...,x_{i_k}\right),\quad \forall j \in \{1, \ldots, k\}

   The signature of the :class:`CostFunction` (number and sizes of
   input parameter blocks and number of outputs) is stored in
   :member:`CostFunction::parameter_block_sizes_` and
   :member:`CostFunction::num_residuals_` respectively. User code
   inheriting from this class is expected to set these two members
   with the corresponding accessors. This information will be verified
   by the :class:`Problem` when added with
   :func:`Problem::AddResidualBlock`.

.. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians) const

   This is the key method. It implements the residual and Jacobian
   computation.

   ``parameters`` is an array of pointers to arrays containing the
   various parameter blocks. ``parameters`` has the same number of
   elements as :member:`CostFunction::parameter_block_sizes_`.
   Parameter blocks are in the same order as
   :member:`CostFunction::parameter_block_sizes_`.

   ``residuals`` is an array of size ``num_residuals_``.

   ``jacobians`` is an array of size ``parameter_block_sizes_.size()``
   containing pointers to storage for Jacobian matrices corresponding
   to each parameter block. The Jacobian matrices are in the same
   order as :member:`CostFunction::parameter_block_sizes_`.
   ``jacobians[i]`` is an array that contains
   :member:`CostFunction::num_residuals_` x
   :member:`CostFunction::parameter_block_sizes_` ``[i]``
   elements. Each Jacobian matrix is stored in row-major order, i.e.,
   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\partial residual[r]}{\partial parameters[i][c]}`

   If ``jacobians`` is ``NULL``, then no derivatives are returned;
   this is the case when computing cost only. If ``jacobians[i]`` is
   ``NULL``, then the Jacobian matrix corresponding to the
   :math:`i^{\textrm{th}}` parameter block must not be returned; this
   is the case when the parameter block is marked constant.
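
   As an illustration, here is a minimal sketch of a hand-written cost
   function; the class name and the residual :math:`r = 1 - \|x\|^2`
   are made up for this example. It has a single 2-dimensional
   parameter block and one residual, and shows the expected use of the
   size accessors and the ``NULL`` checks on ``jacobians``:

   .. code-block:: c++

     class MyCostFunction : public CostFunction {
      public:
       MyCostFunction() {
         // One parameter block of size 2, one residual.
         mutable_parameter_block_sizes()->push_back(2);
         set_num_residuals(1);
       }

       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const {
         const double* x = parameters[0];
         residuals[0] = 1.0 - x[0] * x[0] - x[1] * x[1];

         // Jacobians are only required if requested, and only for
         // parameter blocks that are not constant.
         if (jacobians != NULL && jacobians[0] != NULL) {
           jacobians[0][0] = -2.0 * x[0];  // d r / d x[0]
           jacobians[0][1] = -2.0 * x[1];  // d r / d x[1]
         }
         return true;
       }
     };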

:class:`SizedCostFunction`
--------------------------

.. class:: SizedCostFunction

   If the sizes of the parameter blocks and the size of the residual
   vector are known at compile time (this is the common case), Ceres
   provides :class:`SizedCostFunction`, where these values can be
   specified as template parameters. In this case the user only needs
   to implement :func:`CostFunction::Evaluate`.

   .. code-block:: c++

     template<int kNumResiduals,
              int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
              int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
     class SizedCostFunction : public CostFunction {
      public:
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const = 0;
     };
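
   For example, a minimal sketch with analytic derivatives for the
   residual :math:`f(x) = 10 - x`, with one residual and one
   1-dimensional parameter block (the class name is illustrative):

   .. code-block:: c++

     class QuadraticCostFunction : public SizedCostFunction<1, 1> {
      public:
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const {
         double x = parameters[0][0];
         residuals[0] = 10.0 - x;
         if (jacobians != NULL && jacobians[0] != NULL) {
           jacobians[0][0] = -1.0;  // d f / d x
         }
         return true;
       }
     };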

:class:`AutoDiffCostFunction`
-----------------------------

.. class:: AutoDiffCostFunction

   But even defining the :class:`SizedCostFunction` can be a tedious
   affair if complicated derivative computations are involved. To this
   end Ceres provides automatic differentiation.

   To get an auto differentiated cost function, you must define a
   class with a templated ``operator()`` (a functor) that computes the
   cost function in terms of the template parameter ``T``. The
   autodiff framework substitutes appropriate ``Jet`` objects for
   ``T`` in order to compute the derivative when necessary, but this
   is hidden, and you should write the function as if ``T`` were a
   scalar type (e.g. a double-precision floating point number).

   The function must write the computed value in the last argument
   (the only non-``const`` one) and return true to indicate success.

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional vector
   parameters and :math:`k` is a constant. The form of this error,
   which is the difference between a constant and an expression, is a
   common pattern in least squares problems. For example, the value
   :math:`x^\top y` might be the model expectation for a series of
   measurements, where there is an instance of the cost function for
   each measurement :math:`k`.

   The actual cost added to the total problem is :math:`e^2`, or
   :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
   by the optimization framework.

   To write an auto-differentiable cost function for the above model,
   first define the object

   .. code-block:: c++

     class MyScalarCostFunctor {
      public:
       MyScalarCostFunctor(double k) : k_(k) {}

       template <typename T>
       bool operator()(const T* const x, const T* const y, T* e) const {
         e[0] = T(k_) - x[0] * y[0] - x[1] * y[1];
         return true;
       }

      private:
       double k_;
     };

   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to arrays
   of ``T``. If there were three input parameters, then the third input
   parameter would come after ``y``. The output is always the last
   parameter, and is also a pointer to an array. In the example above,
   ``e`` is a scalar, so only ``e[0]`` is set.

   Then given this class definition, the auto differentiated cost
   function for it can be constructed as follows.

   .. code-block:: c++

     CostFunction* cost_function
         = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
             new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                         |  |  |
                             Dimension of residual ------+  |  |
                             Dimension of x ----------------+  |
                             Dimension of y -------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   **WARNING 1** Since the functor will get instantiated with
   different types for ``T``, you must convert from other numeric
   types to ``T`` before mixing computations with other variables
   of type ``T``. In the example above, this is seen where instead of
   using ``k_`` directly, ``k_`` is wrapped with ``T(k_)``.

   **WARNING 2** A common beginner's error when first using
   :class:`AutoDiffCostFunction` is to get the sizing wrong. In particular,
   there is a tendency to set the template parameters to (dimension of
   residual, number of parameters) instead of passing a dimension
   parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the 2
   as the last template argument.

:class:`NumericDiffCostFunction`
--------------------------------

.. class:: NumericDiffCostFunction

   .. code-block:: c++

     template <typename CostFunctionNoJacobian,
               NumericDiffMethod method = CENTRAL, int M = 0,
               int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
               int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
     class NumericDiffCostFunction
         : public SizedCostFunction<M, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
     };

   Creates a :class:`CostFunction` as needed by the least squares
   framework with Jacobians computed via numeric (a.k.a. finite)
   differentiation. For more details see
   http://en.wikipedia.org/wiki/Numerical_differentiation.

   To get a numerically differentiated :class:`CostFunction`, you
   must define a class with a ``operator()`` (a functor) that computes
   the residuals. The functor must write the computed value in the
   last argument (the only non-``const`` one) and return ``true`` to
   indicate success, e.g., an object of the form

   .. code-block:: c++

     struct ScalarFunctor {
      public:
       bool operator()(const double* const x1,
                       const double* const x2,
                       double* residuals) const;
     };

   For example, consider a scalar error :math:`e = k - x'y`, where
   both :math:`x` and :math:`y` are two-dimensional column vector
   parameters, the prime sign indicates transposition, and :math:`k`
   is a constant. The form of this error, which is the difference
   between a constant and an expression, is a common pattern in least
   squares problems. For example, the value :math:`x'y` might be the
   model expectation for a series of measurements, where there is an
   instance of the cost function for each measurement :math:`k`.

   To write a numerically differentiated :class:`CostFunction` for the
   above model, first define the object

   .. code-block:: c++

     class MyScalarCostFunctor {
      public:
       MyScalarCostFunctor(double k) : k_(k) {}

       bool operator()(const double* const x,
                       const double* const y,
                       double* residuals) const {
         residuals[0] = k_ - x[0] * y[0] - x[1] * y[1];
         return true;
       }

      private:
       double k_;
     };

   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``double`` s. If there were three input parameters, then
   the third input parameter would come after ``y``. The output is
   always the last parameter, and is also a pointer to an array. In
   the example above, the residual is a scalar, so only
   ``residuals[0]`` is set.

   Then given this class definition, the numerically differentiated
   :class:`CostFunction` with central differences used for computing
   the derivative can be constructed as follows.

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
             new MyScalarCostFunctor(1.0));                    ^        ^  ^  ^
                                                               |        |  |  |
                             Finite Differencing Scheme -------+        |  |  |
                             Dimension of residual ---------------------+  |  |
                             Dimension of x --------------------------------+  |
                             Dimension of y -----------------------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   The ``CENTRAL`` difference method is considerably more accurate, at
   the cost of twice as many function evaluations as forward
   differences. Consider using central differences to begin with, and
   only after that works, try forward differences to improve
   performance.

   **WARNING** A common beginner's error when first using
   :class:`NumericDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the last ``2`` argument. Please be careful when setting the
   size parameters.

   **Alternate Interface**

   For a variety of reasons, including compatibility with legacy code,
   :class:`NumericDiffCostFunction` can also take
   :class:`CostFunction` objects as input. The following describes
   how.

   To get a numerically differentiated cost function, define a
   subclass of :class:`CostFunction` such that the
   :func:`CostFunction::Evaluate` function ignores the ``jacobians``
   parameter. The numeric differentiation wrapper will fill in the
   ``jacobians`` parameter if necessary by repeatedly calling
   :func:`CostFunction::Evaluate` with small changes to the
   appropriate parameters, and computing the slope. For performance,
   the numeric differentiation wrapper class is templated on the
   concrete cost function, even though it could be implemented only in
   terms of the :class:`CostFunction` interface.

   The numerically differentiated version of a cost function can be
   constructed as follows:

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
             new MyCostFunction(...), TAKE_OWNERSHIP);

   where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
   sizes 4 and 8 respectively. Look at the tests for a more detailed
   example.

:class:`NormalPrior`
--------------------

.. class:: NormalPrior

   .. code-block:: c++

     class NormalPrior: public CostFunction {
      public:
       // Check that the number of rows in the vector b is the same as
       // the number of columns in the matrix A; crash otherwise.
       NormalPrior(const Matrix& A, const Vector& b);

       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const;
     };

   Implements a cost function of the form

   .. math:: cost(x) = \|A(x - b)\|^2

   where the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form

   .. math:: cost(x) = (x - \mu)^T S^{-1} (x - \mu)

   where :math:`\mu` is a vector and :math:`S` is a covariance matrix,
   then :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is the square
   root of the inverse of the covariance, also known as the stiffness
   matrix. There are however no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.
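
   For example, a short sketch of adding a prior on a 3-dimensional
   parameter block ``x``; here ``problem`` and ``x`` are assumed to
   exist, and ``A`` is assumed to have been computed elsewhere as the
   stiffness matrix of the prior:

   .. code-block:: c++

     Matrix A(3, 3);  // ceres::Matrix is a dense double matrix.
     Vector mu(3);    // ceres::Vector is a dense double vector.
     // ... fill A with the stiffness matrix and mu with the mean ...
     problem.AddResidualBlock(new NormalPrior(A, mu), NULL, x);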

:class:`ConditionedCostFunction`
--------------------------------

.. class:: ConditionedCostFunction

   This class allows you to apply different conditioning to the residual
   values of a wrapped cost function. An example where this is useful is
   where you have an existing cost function that produces N values, but you
   want the total cost to be something other than just the sum of these
   squared values - maybe you want to apply a different scaling to some
   values, to change their contribution to the cost.

   Usage:

   .. code-block:: c++

     // my_cost_function produces N residuals
     CostFunction* my_cost_function = ...
     CHECK_EQ(N, my_cost_function->num_residuals());
     vector<CostFunction*> conditioners;

     // Make N 1x1 cost functions (1 parameter, 1 residual)
     CostFunction* f_1 = ...
     conditioners.push_back(f_1);

     CostFunction* f_N = ...
     conditioners.push_back(f_N);
     ConditionedCostFunction* ccf =
         new ConditionedCostFunction(my_cost_function, conditioners);

   Now ``ccf``'s ``residual[i]`` (i=0..N-1) will be passed through the
   :math:`i^{\text{th}}` conditioner.

   .. code-block:: c++

     ccf_residual[i] = f_i(my_cost_function_residual[i])

   and the Jacobian will be affected appropriately.
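
   A conditioner is just a :class:`CostFunction` with one
   1-dimensional parameter block and one residual. For example, a
   minimal sketch of an affine conditioner :math:`f(r) = a r + b`; the
   class name is made up for this example:

   .. code-block:: c++

     class AffineConditioner : public SizedCostFunction<1, 1> {
      public:
       AffineConditioner(double a, double b) : a_(a), b_(b) {}

       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const {
         residuals[0] = a_ * parameters[0][0] + b_;
         if (jacobians != NULL && jacobians[0] != NULL) {
           jacobians[0][0] = a_;
         }
         return true;
       }

      private:
       double a_, b_;
     };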

:class:`CostFunctionToFunctor`
------------------------------

.. class:: CostFunctionToFunctor

   :class:`CostFunctionToFunctor` is an adapter class that allows users to use
   :class:`CostFunction` objects in templated functors which are to be used for
   automatic differentiation. This allows the user to seamlessly mix
   analytic, numeric and automatic differentiation.

   For example, let us assume that

   .. code-block:: c++

     class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
      public:
       IntrinsicProjection(const double* observations);
       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const;
     };

   is a :class:`CostFunction` that implements the projection of a
   point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute its
   residuals, and its Jacobians either via analytic or numeric
   differentiation.

   Now we would like to compose the action of this
   :class:`CostFunction` with the action of camera extrinsics, i.e.,
   rotation and translation. Say we have a templated function

   .. code-block:: c++

     template<typename T>
     void RotateAndTranslatePoint(const T* rotation,
                                  const T* translation,
                                  const T* point,
                                  T* result);

   Then we can now do the following,

   .. code-block:: c++

     struct CameraProjection {
       CameraProjection(double* observation) {
         intrinsic_projection_.reset(
             new CostFunctionToFunctor<2, 5, 3>(
                 new IntrinsicProjection(observation)));
       }

       template <typename T>
       bool operator()(const T* rotation,
                       const T* translation,
                       const T* intrinsics,
                       const T* point,
                       T* residual) const {
         T transformed_point[3];
         RotateAndTranslatePoint(rotation, translation, point, transformed_point);

         // Note that we call intrinsic_projection_, just like it was
         // any other templated functor.
         return (*intrinsic_projection_)(intrinsics, transformed_point, residual);
       }

      private:
       scoped_ptr<CostFunctionToFunctor<2, 5, 3> > intrinsic_projection_;
     };

:class:`NumericDiffFunctor`
---------------------------

.. class:: NumericDiffFunctor

   A wrapper class that takes a variadic functor evaluating a
   function, numerically differentiates it and makes it available as a
   templated functor so that it can be easily used as part of Ceres'
   automatic differentiation framework.

   For example, let us assume that

   .. code-block:: c++

     struct IntrinsicProjection {
       IntrinsicProjection(const double* observations);
       bool operator()(const double* calibration,
                       const double* point,
                       double* residuals);
     };

   is a functor that implements the projection of a point in its local
   coordinate system onto its image plane and subtracts it from the
   observed point projection.

   Now we would like to compose the action of this functor with the
   action of camera extrinsics, i.e., rotation and translation, which
   is given by the following templated function

   .. code-block:: c++

     template<typename T>
     void RotateAndTranslatePoint(const T* rotation,
                                  const T* translation,
                                  const T* point,
                                  T* result);

   To compose the extrinsics and intrinsics, we can construct a
   ``CameraProjection`` functor as follows.

   .. code-block:: c++

     struct CameraProjection {
       typedef NumericDiffFunctor<IntrinsicProjection, CENTRAL, 2, 5, 3>
           IntrinsicProjectionFunctor;

       CameraProjection(double* observation) {
         intrinsic_projection_.reset(
             new IntrinsicProjectionFunctor(observation));
       }

       template <typename T>
       bool operator()(const T* rotation,
                       const T* translation,
                       const T* intrinsics,
                       const T* point,
                       T* residuals) const {
         T transformed_point[3];
         RotateAndTranslatePoint(rotation, translation, point, transformed_point);
         return (*intrinsic_projection_)(intrinsics, transformed_point, residuals);
       }

      private:
       scoped_ptr<IntrinsicProjectionFunctor> intrinsic_projection_;
     };

   Here, we made the choice of using ``CENTRAL`` differences to compute
   the Jacobian of ``IntrinsicProjection``.

   Now, we are ready to construct an automatically differentiated cost
   function as

   .. code-block:: c++

     CostFunction* cost_function =
         new AutoDiffCostFunction<CameraProjection, 2, 3, 3, 5, 3>(
             new CameraProjection(observations));

   ``cost_function`` now seamlessly integrates automatic
   differentiation of ``RotateAndTranslatePoint`` with a numerically
   differentiated version of ``IntrinsicProjection``.

:class:`LossFunction`
---------------------

.. class:: LossFunction

   For least squares problems where the minimization may encounter
   input terms that contain outliers, that is, completely bogus
   measurements, it is important to use a loss function that reduces
   their influence.

   Consider a structure from motion problem. The unknowns are 3D
   points and camera parameters, and the measurements are image
   coordinates describing the expected reprojected position for a
   point in a camera. For example, we want to model the geometry of a
   street scene with fire hydrants and cars, observed by a moving
   camera with unknown parameters, and the only 3D points we care
   about are the pointy tippy-tops of the fire hydrants. Our magic
   image processing algorithm, which is responsible for producing the
   measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we didn't do
   anything special the residual for the erroneous measurement will
   result in the entire solution getting pulled away from the optimum
   to reduce the large error that would otherwise be attributed to the
   wrong measurement.

   Using a robust loss function, the cost for large residuals is
   reduced. In the example above, this leads to outlier terms getting
   down-weighted so they do not overly influence the final solution.

   .. code-block:: c++

     class LossFunction {
      public:
       virtual void Evaluate(double s, double out[3]) const = 0;
     };

   The key method is :func:`LossFunction::Evaluate`, which given a
   non-negative scalar ``s``, computes

   .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}

   Here the convention is that the contribution of a term to the cost
   function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s
   =\|f_i\|^2`. Calling the method with a negative value of :math:`s`
   is an error and the implementations are not required to handle that
   case.

   Most sane choices of :math:`\rho` satisfy:

   .. math::

      \rho(0) &= 0\\
      \rho'(0) &= 1\\
      \rho'(s) &< 1 \text{ in the outlier region}\\
      \rho''(s) &< 0 \text{ in the outlier region}

   so that they mimic the squared cost for small residuals.
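
   As an illustration of this interface, here is a sketch of how the
   (unscaled) Huber loss described below could be implemented; Ceres
   already provides :class:`HuberLoss`, so this is purely to show the
   contract on ``out``:

   .. code-block:: c++

     class MyHuberLoss : public LossFunction {
      public:
       virtual void Evaluate(double s, double out[3]) const {
         if (s <= 1.0) {
           out[0] = s;    // rho(s)
           out[1] = 1.0;  // rho'(s)
           out[2] = 0.0;  // rho''(s)
         } else {
           const double r = std::sqrt(s);
           out[0] = 2.0 * r - 1.0;
           out[1] = 1.0 / r;
           out[2] = -0.5 / (s * r);
         }
       }
     };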

   **Scaling**

   Given one robustifier :math:`\rho(s)` one can change the length
   scale at which robustification takes place, by adding a scale
   factor :math:`a > 0` which gives us :math:`\rho(s,a) = a^2 \rho(s /
   a^2)` and the first and second derivatives as :math:`\rho'(s /
   a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.

   The reason for the appearance of squaring is that :math:`a` is in
   the units of the residual vector norm whereas :math:`s` is a squared
   norm. For applications it is more convenient to specify :math:`a` than
   its square.

Instances
^^^^^^^^^

Ceres includes a number of predefined loss functions. For simplicity
we described their unscaled versions. The figure below illustrates
their shape graphically. More details can be found in
``include/ceres/loss_function.h``.

.. figure:: loss.png
   :figwidth: 500px
   :height: 400px
   :align: center

   Shape of the various common loss functions.

.. class:: TrivialLoss

   .. math:: \rho(s) = s

.. class:: HuberLoss

   .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}

.. class:: SoftLOneLoss

   .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)

.. class:: CauchyLoss

   .. math:: \rho(s) = \log(1 + s)

.. class:: ArctanLoss

   .. math:: \rho(s) = \arctan(s)

.. class:: TolerantLoss

   .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})

.. class:: ComposedLoss

   Given two loss functions :math:`f` and :math:`g`, implements the
   composed loss function :math:`h(s) = f(g(s))`.

.. class:: ScaledLoss

   Given a loss function :math:`\rho(s)` and a scalar :math:`a`,
   implements the loss function :math:`a \rho(s)`.

.. class:: LossFunctionWrapper

   Wraps another loss function and allows it to be replaced after the
   optimization problem has been constructed.
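
For example, a sketch of typical usage, where ``problem``,
``cost_function`` and ``x`` are assumed to exist; the constructor
argument of the loss functions is the scale :math:`a` described above:

.. code-block:: c++

  problem.AddResidualBlock(cost_function, new HuberLoss(1.0), x);

  // LossFunctionWrapper lets the loss be swapped out after the
  // problem has been constructed.
  LossFunctionWrapper* loss = new LossFunctionWrapper(new HuberLoss(1.0),
                                                      TAKE_OWNERSHIP);
  problem.AddResidualBlock(cost_function, loss, x);
  loss->Reset(new CauchyLoss(0.5), TAKE_OWNERSHIP);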

Theory
^^^^^^

Let us consider a problem with a single residual block and a single
parameter block.

.. math::

   \min_x \frac{1}{2}\rho(f^2(x))


Then, the robustified gradient and the Gauss-Newton Hessian are

.. math::

   g(x) &= \rho'J^\top(x)f(x)\\
   H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)

where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the corresponding linear least squares
problem yields the robustified Gauss-Newton step.


Let :math:`\alpha` be a root of

.. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.
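
Solving this quadratic and choosing the root with :math:`\alpha \le 1`
gives

.. math:: \alpha = 1 - \sqrt{1 + 2 \frac{\rho''}{\rho'}\|f(x)\|^2}.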

Then, define the rescaled residual and Jacobian as

.. math::

   \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
   \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
   \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)


In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
we limit :math:`\alpha \le 1- \epsilon` for some small
:math:`\epsilon`. For more details see [Triggs]_.

With this simple rescaling, one can use any Jacobian based non-linear
least squares algorithm to solve robustified non-linear least squares
problems.


:class:`LocalParameterization`
------------------------------

.. class:: LocalParameterization

   .. code-block:: c++

     class LocalParameterization {
      public:
       virtual ~LocalParameterization() {}
       virtual bool Plus(const double* x,
                         const double* delta,
                         double* x_plus_delta) const = 0;
       virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
       virtual int GlobalSize() const = 0;
       virtual int LocalSize() const = 0;
     };

   Sometimes the parameters :math:`x` can overparameterize a
   problem. In that case it is desirable to choose a parameterization
   to remove the null directions of the cost. More generally, if
   :math:`x` lies on a manifold of a smaller dimension than the
   ambient space that it is embedded in, then it is numerically and
   computationally more effective to optimize it using a
   parameterization that lives in the tangent space of that manifold
   at each point.

   For example, a sphere in three dimensions is a two dimensional
   manifold, embedded in a three dimensional space. At each point on
   the sphere, the plane tangent to it defines a two dimensional
   tangent space. For a cost function defined on this sphere, given a
   point :math:`x`, moving in the direction normal to the sphere at
   that point is not useful. Thus a better way to parameterize a point
   on a sphere is to optimize over a two dimensional vector
   :math:`\Delta x` in the tangent space at that point, and then
   "move" to the point :math:`x + \Delta x`, where the move operation
   involves projecting back onto the sphere. Doing so removes a
   redundant dimension from the optimization, making it numerically
   more robust and efficient.

   More generally we can define a function

   .. math:: x' = \boxplus(x, \Delta x),

   where :math:`x'` has the same size as :math:`x`, and :math:`\Delta
   x` is of size less than or equal to :math:`x`. The function
   :math:`\boxplus` generalizes the definition of vector
   addition. Thus it satisfies the identity

   .. math:: \boxplus(x, 0) = x,\quad \forall x.

   Instances of :class:`LocalParameterization` implement the
   :math:`\boxplus` operation and its derivative with respect to
   :math:`\Delta x` at :math:`\Delta x = 0`.


.. function:: int LocalParameterization::GlobalSize()

   The dimension of the ambient space in which the parameter block
   :math:`x` lives.

.. function:: int LocalParameterization::LocalSize()

   The size of the tangent space that :math:`\Delta x` lives in.

.. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const

   :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta x)`.

.. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const

   Computes the Jacobian matrix

   .. math:: J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}

   in row major form.

Instances
^^^^^^^^^

.. class:: IdentityParameterization

   A trivial version of :math:`\boxplus` is when :math:`\Delta x` is
   of the same size as :math:`x` and

   .. math:: \boxplus(x, \Delta x) = x + \Delta x

.. class:: SubsetParameterization

   A more interesting case is when :math:`x` is a two dimensional
   vector, and the user wishes to hold the first coordinate
   constant. Then, :math:`\Delta x` is a scalar and :math:`\boxplus`
   is defined as

   .. math::

      \boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
      \end{array} \right] \Delta x

   :class:`SubsetParameterization` generalizes this construction to
   hold any part of a parameter block constant. A sketch implementing
   this two dimensional example directly is shown below.
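
   .. code-block:: c++

     // A sketch implementing the two dimensional example above as a
     // LocalParameterization; in practice one would simply use
     // SubsetParameterization, this is only illustrative.
     class HoldFirstCoordinateConstant : public LocalParameterization {
      public:
       virtual ~HoldFirstCoordinateConstant() {}
       virtual bool Plus(const double* x,
                         const double* delta,
                         double* x_plus_delta) const {
         x_plus_delta[0] = x[0];
         x_plus_delta[1] = x[1] + delta[0];
         return true;
       }
       // The 2 x 1 row-major Jacobian of Plus with respect to delta,
       // evaluated at delta = 0.
       virtual bool ComputeJacobian(const double* x, double* jacobian) const {
         jacobian[0] = 0.0;
         jacobian[1] = 1.0;
         return true;
       }
       virtual int GlobalSize() const { return 2; }
       virtual int LocalSize() const { return 1; }
     };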

.. class:: QuaternionParameterization

   Another example that occurs commonly in Structure from Motion
   problems is when camera rotations are parameterized using a
   quaternion. There, it is useful only to make updates orthogonal to
   the 4-vector defining the quaternion. One way to do this is to let
   :math:`\Delta x` be a 3 dimensional vector and define
   :math:`\boxplus` to be

   .. math:: \boxplus(x, \Delta x) = \left[ \cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x \right] * x
      :label: quaternion

   The multiplication between the two 4-vectors on the right hand side
   is the standard quaternion
   product. :class:`QuaternionParameterization` is an implementation
   of :eq:`quaternion`.
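
   For example, a short sketch of using it; here ``q`` is assumed to
   store a unit quaternion with the scalar part first:

   .. code-block:: c++

     double q[4] = {1.0, 0.0, 0.0, 0.0};
     Problem problem;
     problem.AddParameterBlock(q, 4, new QuaternionParameterization);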


:class:`Problem`
----------------

.. class:: Problem

   :class:`Problem` holds the robustified non-linear least squares
   problem :eq:`ceresproblem3`. To create a least squares problem, use
   the :func:`Problem::AddResidualBlock` and
   :func:`Problem::AddParameterBlock` methods.

   For example a problem containing 3 parameter blocks of sizes 3, 4
   and 5 respectively and two residual blocks of size 2 and 6:

   .. code-block:: c++

     double x1[] = { 1.0, 2.0, 3.0 };
     double x2[] = { 1.0, 2.0, 3.0, 5.0 };
     double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };

     Problem problem;
     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x3);

   :func:`Problem::AddResidualBlock`, as the name implies, adds a
   residual block to the problem. It adds a :class:`CostFunction`, an
   optional :class:`LossFunction` (``NULL`` above) and connects the
   :class:`CostFunction` to a set of parameter blocks.

   The cost function carries with it information about the sizes of
   the parameter blocks it expects. The function checks that these
   match the sizes of the parameter blocks listed in
   ``parameter_blocks``. The program aborts if a mismatch is
   detected. ``loss_function`` can be ``NULL``, in which case the cost
   of the term is just the squared norm of the residuals.

   The user has the option of explicitly adding the parameter blocks
   using :func:`Problem::AddParameterBlock`. This causes additional correctness
   checking; however, :func:`Problem::AddResidualBlock` implicitly adds the
   parameter blocks if they are not present, so calling
   :func:`Problem::AddParameterBlock` explicitly is not required.

   :func:`Problem::AddParameterBlock` explicitly adds a parameter
   block to the :class:`Problem`. Optionally it allows the user to
   associate a :class:`LocalParameterization` object with the parameter
   block too. Repeated calls with the same arguments are
   ignored. Repeated calls with the same double pointer but a
   different size result in undefined behaviour.

   You can set any parameter block to be constant using
   :func:`Problem::SetParameterBlockConstant` and undo this using
   :func:`Problem::SetParameterBlockVariable`.

   In fact you can set any number of parameter blocks to be constant,
   and Ceres is smart enough to figure out what part of the problem
   you have constructed depends on the parameter blocks that are free
   to change, and only spends time solving it. So for example, if you
   constructed a problem with a million parameter blocks and 2 million
   residual blocks, but then set all but one parameter block to be
   constant, such that only 10 residual blocks depend on this one
   non-constant parameter block, then the computational effort Ceres
   spends in solving this problem will be the same as if you had
   defined a problem with one parameter block and 10 residual blocks.
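
   Continuing the example above, a sketch of marking ``x2`` constant
   and later freeing it again:

   .. code-block:: c++

     problem.SetParameterBlockConstant(x2);
     // ... solve, inspect the solution, etc. ...
     problem.SetParameterBlockVariable(x2);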

   **Ownership**

   :class:`Problem` by default takes ownership of the
   ``cost_function``, ``loss_function`` and ``local_parameterization``
   pointers. These objects remain live for the life of the
   :class:`Problem`. If the user wishes to keep control over the
   destruction of these objects, then they can do this by setting the
   corresponding enums in the :class:`Problem::Options` struct.

   Even though :class:`Problem` takes ownership of these pointers, it
   does not preclude the user from re-using them in another residual
   or parameter block. The destructor takes care to call delete on
   each pointer only once.


.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*> parameter_blocks)


.. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization)
              void Problem::AddParameterBlock(double* values, int size)


.. function:: void Problem::SetParameterBlockConstant(double* values)

.. function:: void Problem::SetParameterBlockVariable(double* values)


.. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization)


.. function:: int Problem::NumParameterBlocks() const


.. function:: int Problem::NumParameters() const


.. function:: int Problem::NumResidualBlocks() const


.. function:: int Problem::NumResiduals() const



``rotation.h``
--------------

Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
quaternion and matrix) we provide a handy set of templated
functions. These functions are templated so that the user can use them
within Ceres Solver's automatic differentiation framework.
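
For example, a sketch of a pinhole reprojection functor that uses
``AngleAxisRotatePoint`` inside Ceres' automatic differentiation
framework; the functor, its parameter layout and the member names are
illustrative assumptions, not a fixed Ceres API:

.. code-block:: c++

  struct ReprojectionError {
    ReprojectionError(double observed_x, double observed_y)
        : observed_x_(observed_x), observed_y_(observed_y) {}

    template <typename T>
    bool operator()(const T* const angle_axis,   // 3-vector rotation
                    const T* const translation,  // 3-vector translation
                    const T* const point,        // 3-vector point
                    T* residuals) const {
      T p[3];
      AngleAxisRotatePoint(angle_axis, point, p);
      p[0] += translation[0];
      p[1] += translation[1];
      p[2] += translation[2];
      // Normalized pinhole projection.
      residuals[0] = p[0] / p[2] - T(observed_x_);
      residuals[1] = p[1] / p[2] - T(observed_y_);
      return true;
    }

    double observed_x_;
    double observed_y_;
  };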

.. function:: void AngleAxisToQuaternion<T>(T const* angle_axis, T* quaternion)

   Convert a value in combined axis-angle representation to a
   quaternion.

   The value ``angle_axis`` is a triple whose norm is an angle in radians,
   and whose direction is aligned with the axis of rotation, and
   ``quaternion`` is a 4-tuple that will contain the resulting quaternion.

.. function:: void QuaternionToAngleAxis<T>(T const* quaternion, T* angle_axis)

   Convert a quaternion to the equivalent combined axis-angle
   representation.

   The value ``quaternion`` must be a unit quaternion - it is not
   normalized first, and ``angle_axis`` will be filled with a value
   whose norm is the angle of rotation in radians, and whose direction
   is the axis of rotation.

.. function:: void RotationMatrixToAngleAxis<T>(T const * R, T * angle_axis)
.. function:: void AngleAxisToRotationMatrix<T>(T const * angle_axis, T * R)

   Conversions between 3x3 rotation matrix (in column major order) and
   axis-angle rotation representations.

.. function:: void EulerAnglesToRotationMatrix<T>(const T* euler, int row_stride, T* R)

   Conversions between 3x3 rotation matrix (in row major order) and
   Euler angle (in degrees) rotation representations.

   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
   axes, respectively. They are applied in that same order, so the
   total rotation R is Rz * Ry * Rx.

.. function:: void QuaternionToScaledRotation<T>(const T q[4], T R[3 * 3])

   Convert a 4-vector to a 3x3 scaled rotation matrix.

   The choice of rotation is such that the quaternion
   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
   matrix and for small :math:`a, b, c` the quaternion
   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix

   .. math::

      I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a\\ -b & a & 0
      \end{bmatrix} + O(q^2)

   which corresponds to a Rodrigues approximation, the last matrix
   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
   c\end{bmatrix}`. Together with the property that :math:`R(q_1 * q_2)
   = R(q_1) * R(q_2)` this uniquely defines the mapping from :math:`q` to
   :math:`R`.

   The rotation matrix ``R`` is row-major.

   No normalization of the quaternion is performed, i.e.
   :math:`R = \|q\|^2 Q`, where :math:`Q` is an orthonormal matrix
   such that :math:`\det(Q) = 1` and :math:`Q Q' = I`.


.. function:: void QuaternionToRotation<T>(const T q[4], T R[3 * 3])

   Same as above except that the rotation matrix is normalized by the
   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).

.. function:: void UnitQuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   Rotates a point ``pt`` by a quaternion ``q``:

   .. math:: \text{result} = R(q)\ \text{pt}

   Assumes the quaternion is unit norm. If you pass in a quaternion
   with :math:`|q|^2 = 2` then you WILL NOT get back 2 times the
   result you get for a unit quaternion.


.. function:: void QuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   With this function you do not need to assume that ``q`` has unit norm.
   It does assume that the norm is non-zero.

.. function:: void QuaternionProduct<T>(const T z[4], const T w[4], T zw[4])

   .. math:: zw = z * w

   where :math:`*` is the quaternion product between 4-vectors.


.. function:: void CrossProduct<T>(const T x[3], const T y[3], T x_cross_y[3])

   .. math:: \text{x\_cross\_y} = x \times y

.. function:: void AngleAxisRotatePoint<T>(const T angle_axis[3], const T pt[3], T result[3])

   .. math:: y = R(\text{angle\_axis}) x