Blame - docs/modeling.tex - ceres-solver

blob: e6dce1f4a61bf5822711e5aee3f619c4574e5118 [file] [log] [blame]

Sameer Agarwal	8ed29a7	2012-06-07 17:04:25 -0700	[diff] [blame]	1	%!TEX root = ceres-solver.tex
Sameer Agarwal	d3eaa48	2012-05-29 23:47:57 -0700	[diff] [blame]	2	\chapter{Modeling}
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	3	\label{chapter:api}
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	4	\section{\texttt{CostFunction}}
Sameer Agarwal	c811d01	2012-11-01 10:22:11 -0700	[diff] [blame]	5	Given parameter blocks $\left[x_{i_1}, \hdots , x_{i_k}\right]$, a
				6	\texttt{CostFunction} is responsible for computing
				7	a vector of residuals and if asked a vector of Jacobian matrices, i.e., given $\left[x_{i_1}, \hdots , x_{i_k}\right]$, compute the vector $f_i\left(x_{i_1},\hdots,x_{i_k}\right)$ and the matrices
				8
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	9	\begin{equation}
Sameer Agarwal	c811d01	2012-11-01 10:22:11 -0700	[diff] [blame]	10	J_{ij} = \frac{\partial}{\partial x_{j}}f_i\left(x_{i_1},\hdots,x_{i_k}\right),\quad \forall j \in \{i_1,\hdots, i_k\}
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	11	\end{equation}
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	12	\begin{minted}{c++}
				13	class CostFunction {
				14	public:
				15	virtual bool Evaluate(double const* const* parameters,
				16	double* residuals,
				17	double** jacobians) = 0;
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	18	const vector<int16>& parameter_block_sizes();
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	19	int num_residuals() const;
				20
				21	protected:
				22	vector<int16>* mutable_parameter_block_sizes();
				23	void set_num_residuals(int num_residuals);
				24	};
				25	\end{minted}
				26
				27	The signature of the function (number and sizes of input parameter blocks and number of outputs)
				28	is stored in \texttt{parameter\_block\_sizes\_} and \texttt{num\_residuals\_} respectively. User
				29	code inheriting from this class is expected to set these two members with the
				30	corresponding accessors. This information will be verified by the Problem
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	31	when added with \texttt{Problem::AddResidualBlock}.
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	32
				33	The most important method here is \texttt{Evaluate}. It implements the residual and Jacobian computation.
				34
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	35	\texttt{parameters} is an array of pointers to arrays containing the various parameter blocks. parameters has the same number of elements as parameter\_block\_sizes\_. Parameter blocks are in the same order as parameter\_block\_sizes\_.
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	36
				37
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	38	\texttt{residuals} is an array of size \texttt{num\_residuals\_}.
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	39
				40
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	41	\texttt{jacobians} is an array of size \texttt{parameter\_block\_sizes\_} containing pointers to storage for Jacobian matrices corresponding to each parameter block. The Jacobian matrices are in the same order as \texttt{parameter\_block\_sizes\_}. \texttt{jacobians[i]} is an array that contains \texttt{num\_residuals\_} $\times$ \texttt{parameter\_block\_sizes\_[i]} elements. Each Jacobian matrix is stored in row-major order, i.e.,
				42
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	43	\begin{equation}
Sameer Agarwal	d3eaa48	2012-05-29 23:47:57 -0700	[diff] [blame]	44	\texttt{jacobians[i][r * parameter\_block\_size\_[i] + c]} =
				45	%\frac{\partial}{\partial x_{ic}} f_{r}\left(x_{1},\hdots, x_{k}\right)
				46	\frac{\partial \texttt{residual[r]}}{\partial \texttt{parameters[i][c]}}
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	47	\end{equation}
				48
				49	If \texttt{jacobians} is \texttt{NULL}, then no derivatives are returned; this is the case when computing cost only. If \texttt{jacobians[i]} is \texttt{NULL}, then the Jacobian matrix corresponding to the $i^{\textrm{th}}$ parameter block must not be returned, this is the case when the a parameter block is marked constant.
				50
				51	\section{\texttt{SizedCostFunction}}
				52	If the size of the parameter blocks and the size of the residual vector is known at compile time (this is the common case), Ceres provides \texttt{SizedCostFunction}, where these values can be specified as template parameters.
				53	\begin{minted}{c++}
				54	template<int kNumResiduals,
				55	int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0, int N5 = 0>
				56	class SizedCostFunction : public CostFunction {
				57	public:
				58	virtual bool Evaluate(double const* const* parameters,
				59	double* residuals,
				60	double** jacobians) = 0;
				61	};
				62	\end{minted}
				63	In this case the user only needs to implement the \texttt{Evaluate} method.
				64
				65	\section{\texttt{AutoDiffCostFunction}}
				66	But even defining the \texttt{SizedCostFunction} can be a tedious affair if complicated derivative computations are involved. To this end Ceres provides automatic differentiation.
				67
				68	To get an auto differentiated cost function, you must define a class with a
				69	templated \texttt{operator()} (a functor) that computes the cost function in terms of
				70	the template parameter \texttt{T}. The autodiff framework substitutes appropriate
				71	\texttt{Jet} objects for T in order to compute the derivative when necessary, but
				72	this is hidden, and you should write the function as if T were a scalar type
				73	(e.g. a double-precision floating point number).
				74
				75	The function must write the computed value in the last argument (the only
				76	non-\texttt{const} one) and return true to indicate success.
				77
				78	For example, consider a scalar error $e = k - x^\top y$, where both $x$ and $y$ are
				79	two-dimensional vector parameters and $k$ is a constant. The form of this error, which is the
				80	difference between a constant and an expression, is a common pattern in least
				81	squares problems. For example, the value $x^\top y$ might be the model expectation
				82	for a series of measurements, where there is an instance of the cost function
				83	for each measurement $k$.
				84
				85	The actual cost added to the total problem is $e^2$, or $(k - x^\top y)^2$; however,
				86	the squaring is implicitly done by the optimization framework.
				87
				88	To write an auto-differentiable cost function for the above model, first
				89	define the object
				90	\begin{minted}{c++}
				91	class MyScalarCostFunction {
				92	MyScalarCostFunction(double k): k_(k) {}
				93	template <typename T>
				94	bool operator()(const T* const x , const T* const y, T* e) const {
Sameer Agarwal	c811d01	2012-11-01 10:22:11 -0700	[diff] [blame]	95	e[0] = T(k_) - x[0] * y[0] - x[1] * y[1];
				96	return true;
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	97	}
				98
				99	private:
				100	double k_;
				101	};
				102	\end{minted}
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	103
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	104	Note that in the declaration of \texttt{operator()} the input parameters \texttt{x} and \texttt{y} come
				105	first, and are passed as const pointers to arrays of \texttt{T}. If there were three
				106	input parameters, then the third input parameter would come after \texttt{y}. The
				107	output is always the last parameter, and is also a pointer to an array. In
				108	the example above, \texttt{e} is a scalar, so only \texttt{e[0]} is set.
				109
				110	Then given this class definition, the auto differentiated cost function for
				111	it can be constructed as follows.
				112
				113	\begin{minted}{c++}
				114	CostFunction* cost_function
				115	= new AutoDiffCostFunction<MyScalarCostFunction, 1, 2, 2>(
				116	new MyScalarCostFunction(1.0)); ^ ^ ^
				117	\| \| \|
				118	Dimension of residual ------+ \| \|
				119	Dimension of x ----------------+ \|
				120	Dimension of y -------------------+
				121	\end{minted}
				122
Sameer Agarwal	7646039	2012-05-06 22:02:19 -0700	[diff] [blame]	123	In this example, there is usually an instance for each measurement of k.
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	124
				125	In the instantiation above, the template parameters following
				126	\texttt{MyScalarCostFunction}, \texttt{<1, 2, 2>} describe the functor as computing a
				127	1-dimensional output from two arguments, both 2-dimensional.
				128
				129	The framework can currently accommodate cost functions of up to 6 independent
				130	variables, and there is no limit on the dimensionality of each of them.
				131
				132	\textbf{WARNING 1} Since the functor will get instantiated with different types for
				133	\texttt{T}, you must convert from other numeric types to \texttt{T} before mixing
				134	computations with other variables of type \texttt{T}. In the example above, this is
				135	seen where instead of using \texttt{k\_} directly, \texttt{k\_} is wrapped with \texttt{T(k\_)}.
				136
				137	\textbf{WARNING 2} A common beginner's error when first using \texttt{AutoDiffCostFunction} is to get the sizing wrong. In particular, there is a tendency to
				138	set the template parameters to (dimension of residual, number of parameters)
				139	instead of passing a dimension parameter for {\em every parameter block}. In the
				140	example above, that would be \texttt{<MyScalarCostFunction, 1, 2>}, which is missing
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	141	the 2 as the last template argument.
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	142
Sameer Agarwal	d3eaa48	2012-05-29 23:47:57 -0700	[diff] [blame]	143	\subsection{Theory \& Implementation}
				144	TBD
				145
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	146	\section{\texttt{NumericDiffCostFunction}}
				147	To get a numerically differentiated cost function, define a subclass of
				148	\texttt{CostFunction} such that the \texttt{Evaluate} function ignores the jacobian
				149	parameter. The numeric differentiation wrapper will fill in the jacobians array
Sameer Agarwal	7646039	2012-05-06 22:02:19 -0700	[diff] [blame]	150	if necessary by repeatedly calling the \texttt{Evaluate} method with
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	151	small changes to the appropriate parameters, and computing the slope. For
				152	performance, the numeric differentiation wrapper class is templated on the
				153	concrete cost function, even though it could be implemented only in terms of
				154	the virtual \texttt{CostFunction} interface.
				155	\begin{minted}{c++}
				156	template <typename CostFunctionNoJacobian,
				157	NumericDiffMethod method = CENTRAL, int M = 0,
				158	int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0, int N5 = 0>
				159	class NumericDiffCostFunction
				160	: public SizedCostFunction<M, N0, N1, N2, N3, N4, N5> {
				161	};
				162	\end{minted}
				163
				164	The numerically differentiated version of a cost function for a cost function
				165	can be constructed as follows:
				166	\begin{minted}{c++}
				167	CostFunction* cost_function
				168	= new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
				169	new MyCostFunction(...), TAKE_OWNERSHIP);
				170	\end{minted}
				171	where \texttt{MyCostFunction} has 1 residual and 2 parameter blocks with sizes 4 and 8
				172	respectively. Look at the tests for a more detailed example.
				173
				174	The central difference method is considerably more accurate at the cost of
				175	twice as many function evaluations than forward difference. Consider using
				176	central differences begin with, and only after that works, trying forward
				177	difference to improve performance.
				178
				179	\section{\texttt{LossFunction}}
				180	For least squares problems where the minimization may encounter
				181	input terms that contain outliers, that is, completely bogus
				182	measurements, it is important to use a loss function that reduces
				183	their influence.
				184
				185	Consider a structure from motion problem. The unknowns are 3D
				186	points and camera parameters, and the measurements are image
				187	coordinates describing the expected reprojected position for a
				188	point in a camera. For example, we want to model the geometry of a
				189	street scene with fire hydrants and cars, observed by a moving
				190	camera with unknown parameters, and the only 3D points we care
				191	about are the pointy tippy-tops of the fire hydrants. Our magic
				192	image processing algorithm, which is responsible for producing the
				193	measurements that are input to Ceres, has found and matched all
				194	such tippy-tops in all image frames, except that in one of the
				195	frame it mistook a car's headlight for a hydrant. If we didn't do
				196	anything special the
				197	residual for the erroneous measurement will result in the
				198	entire solution getting pulled away from the optimum to reduce
				199	the large error that would otherwise be attributed to the wrong
				200	measurement.
				201
				202	Using a robust loss function, the cost for large residuals is
				203	reduced. In the example above, this leads to outlier terms getting
				204	down-weighted so they do not overly influence the final solution.
				205
				206	\begin{minted}{c++}
				207	class LossFunction {
				208	public:
				209	virtual void Evaluate(double s, double out[3]) const = 0;
				210	};
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	211	\end{minted}
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	212
				213	The key method is \texttt{Evaluate}, which given a non-negative scalar \texttt{s}, computes
				214	\begin{align}
				215	\texttt{out} = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}
				216	\end{align}
				217
				218	Here the convention is that the contribution of a term to the cost function is given by $\frac{1}{2}\rho(s)$, where $s = \\|f_i\\|^2$. Calling the method with a negative value of $s$ is an error and the implementations are not required to handle that case.
				219
				220	Most sane choices of $\rho$ satisfy:
				221	\begin{align}
				222	\rho(0) &= 0\\
				223	\rho'(0) &= 1\\
				224	\rho'(s) &< 1 \text{ in the outlier region}\\
				225	\rho''(s) &< 0 \text{ in the outlier region}
				226	\end{align}
				227	so that they mimic the squared cost for small residuals.
				228
				229	\subsection{Scaling}
				230	Given one robustifier $\rho(s)$
				231	one can change the length scale at which robustification takes
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	232	place, by adding a scale factor $a > 0$ which gives us $\rho(s,a) = a^2 \rho(s / a^2)$ and the first and second derivatives as $\rho'(s / a^2)$ and $(1 / a^2) \rho''(s / a^2)$ respectively.
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	233
				234
				235	\begin{figure}[hbt]
				236	\includegraphics[width=\textwidth]{loss.pdf}
				237	\caption{Shape of the various common loss functions.}
				238	\label{fig:loss}
				239	\end{figure}
				240
				241
				242	The reason for the appearance of squaring is that $a$ is in the units of the residual vector norm whereas $s$ is a squared norm. For applications it is more convenient to specify $a$ than
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	243	its square.
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	244
				245	Here are some common loss functions implemented in Ceres. For simplicity we described their unscaled versions. Figure~\ref{fig:loss} illustrates their shape graphically.
				246
				247	\begin{align}
				248	\rho(s)&=s \tag{\texttt{NullLoss}}\\
				249	\rho(s) &= \begin{cases}
				250	s & s \le 1\\
				251	2 \sqrt{s} - 1 & s > 1
				252	\end{cases} \tag{\texttt{HuberLoss}}\\
				253	\rho(s) &= 2 (\sqrt{1+s} - 1) \tag{\texttt{SoftLOneLoss}}\\
				254	\rho(s) &= \log(1 + s) \tag{\texttt{CauchyLoss}}
				255	\end{align}
				256
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	257	Ceres includes a number of other loss functions, the descriptions and
				258	documentation for which can be found in \texttt{loss\_function.h}.
				259
Sameer Agarwal	d3eaa48	2012-05-29 23:47:57 -0700	[diff] [blame]	260	\subsection{Theory \& Implementation}
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	261	Let us consider a problem with a single problem and a single parameter
				262	block.
				263	\begin{align}
				264	\min_x \frac{1}{2}\rho(f^2(x))
				265	\end{align}
				266
				267	Then, the robustified gradient and the Gauss-Newton Hessian are
				268	\begin{align}
				269	g(x) &= \rho'J^\top(x)f(x)\\
Sameer Agarwal	c811d01	2012-11-01 10:22:11 -0700	[diff] [blame]	270	H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	271	\end{align}
Sameer Agarwal	c811d01	2012-11-01 10:22:11 -0700	[diff] [blame]	272	where the terms involving the second derivatives of $f(x)$ have been ignored. Note that $H(x)$ is indefinite if $\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0$. If this is not the case, then its possible to re-weight the residual and the Jacobian matrix such that the corresponding linear least squares problem for the robustified Gauss-Newton step.
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	273
				274
Sameer Agarwal	c811d01	2012-11-01 10:22:11 -0700	[diff] [blame]	275	Let $\alpha$ be a root of
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	276	\begin{equation}
				277	\frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\\|f(x)\\|^2 = 0.
				278	\end{equation}
				279	Then, define the rescaled residual and Jacobian as
				280	\begin{align}
				281	\tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
				282	\tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha \frac{f(x)f^\top(x)}{\left\\|f(x)\right\\|^2} \right)J(x)
				283	\end{align}
				284	In the case $2 \rho''\left\\|f(x)\right\\|^2 + \rho' \lesssim 0$, we limit $\alpha \le 1- \epsilon$ for some small $\epsilon$. For more details see Triggs et al~\cite{triggs-etal-1999}.
				285
				286	With this simple rescaling, one can use any Jacobian based non-linear least squares algorithm to robustifed non-linear least squares problems.
Sameer Agarwal	d3eaa48	2012-05-29 23:47:57 -0700	[diff] [blame]	287
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	288	\section{\texttt{LocalParameterization}}
				289	Sometimes the parameters $x$ can overparameterize a problem. In
				290	that case it is desirable to choose a parameterization to remove
				291	the null directions of the cost. More generally, if $x$ lies on a
				292	manifold of a smaller dimension than the ambient space that it is
				293	embedded in, then it is numerically and computationally more
				294	effective to optimize it using a parameterization that lives in
				295	the tangent space of that manifold at each point.
				296
				297	For example, a sphere in three dimensions is a two dimensional
				298	manifold, embedded in a three dimensional space. At each point on
				299	the sphere, the plane tangent to it defines a two dimensional
				300	tangent space. For a cost function defined on this sphere, given a
				301	point $x$, moving in the direction normal to the sphere at that
				302	point is not useful. Thus a better way to parameterize a point on
				303	a sphere is to optimize over two dimensional vector $\Delta x$ in the
				304	tangent space at the point on the sphere point and then "move" to
				305	the point $x + \Delta x$, where the move operation involves projecting
				306	back onto the sphere. Doing so removes a redundant dimension from
				307	the optimization, making it numerically more robust and efficient.
				308
				309	More generally we can define a function
				310	\begin{equation}
				311	x' = \boxplus(x, \Delta x),
				312	\end{equation}
				313	where $x'$ has the same size as $x$, and $\Delta x$ is of size less
				314	than or equal to $x$. The function $\boxplus$, generalizes the
				315	definition of vector addition. Thus it satisfies the identity
				316	\begin{equation}
				317	\boxplus(x, 0) = x,\quad \forall x.
				318	\end{equation}
				319
				320	Instances of \texttt{LocalParameterization} implement the $\boxplus$ operation and its derivative with respect to $\Delta x$ at $\Delta x = 0$.
				321
				322	\begin{minted}{c++}
				323	class LocalParameterization {
				324	public:
				325	virtual ~LocalParameterization() {}
				326	virtual bool Plus(const double* x,
				327	const double* delta,
				328	double* x_plus_delta) const = 0;
				329	virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
				330	virtual int GlobalSize() const = 0;
				331	virtual int LocalSize() const = 0;
				332	};
				333	\end{minted}
				334
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	335	\texttt{GlobalSize} is the dimension of the ambient space in which the parameter block $x$ lives. \texttt{LocalSize} is the size of the tangent space that $\Delta x$ lives in. \texttt{Plus} implements $\boxplus(x,\Delta x)$ and $\texttt{ComputeJacobian}$ computes the Jacobian matrix
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	336	\begin{equation}
				337	J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right\|_{\Delta x = 0}
				338	\end{equation}
				339	in row major form.
				340
				341	A trivial version of $\boxplus$ is when delta is of the same size as $x$
				342	and
				343
				344	\begin{equation}
				345	\boxplus(x, \Delta x) = x + \Delta x
				346	\end{equation}
				347
				348	A more interesting case if $x$ is a two dimensional vector, and the
				349	user wishes to hold the first coordinate constant. Then, $\Delta x$ is a
				350	scalar and $\boxplus$ is defined as
				351
				352	\begin{equation}
				353	\boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
				354	\end{array} \right] \Delta x
				355	\end{equation}
				356
				357	\texttt{SubsetParameterization} generalizes this construction to hold any part of a parameter block constant.
				358
				359
				360	Another example that occurs commonly in Structure from Motion problems
				361	is when camera rotations are parameterized using a quaternion. There,
				362	it is useful only to make updates orthogonal to that 4-vector defining
				363	the quaternion. One way to do this is to let $\Delta x$ be a 3
				364	dimensional vector and define $\boxplus$ to be
				365
				366	\begin{equation}
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	367	\boxplus(x, \Delta x) =
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	368	\left[
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	369	\cos(\|\Delta x\|), \frac{\sin\left(\|\Delta x\|\right)}{\|\Delta x\|} \Delta x
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	370	\right] * x
				371	\label{eq:quaternion}
				372	\end{equation}
				373	The multiplication between the two 4-vectors on the right hand
				374	side is the standard quaternion product. \texttt{QuaternionParameterization} is an implementation of~\eqref{eq:quaternion}.
				375
Sameer Agarwal	d3eaa48	2012-05-29 23:47:57 -0700	[diff] [blame]	376	\clearpage
				377
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	378	\section{\texttt{Problem}}
				379	\begin{minted}{c++}
				380	class Problem {
				381	public:
				382	struct Options {
				383	Options();
				384	Ownership cost_function_ownership;
				385	Ownership loss_function_ownership;
				386	Ownership local_parameterization_ownership;
				387	};
				388
				389	Problem();
				390	explicit Problem(const Options& options);
				391	~Problem();
				392
				393	ResidualBlockId AddResidualBlock(CostFunction* cost_function,
				394	LossFunction* loss_function,
				395	const vector<double*>& parameter_blocks);
				396
				397	void AddParameterBlock(double* values, int size);
				398	void AddParameterBlock(double* values,
				399	int size,
				400	LocalParameterization* local_parameterization);
				401
				402	void SetParameterBlockConstant(double* values);
				403	void SetParameterBlockVariable(double* values);
				404	void SetParameterization(double* values,
				405	LocalParameterization* local_parameterization);
				406
				407	int NumParameterBlocks() const;
				408	int NumParameters() const;
				409	int NumResidualBlocks() const;
				410	int NumResiduals() const;
				411	};
				412	\end{minted}
				413
				414	The \texttt{Problem} objects holds the robustified non-linear least squares problem~\eqref{eq:ceresproblem}. To create a least squares problem, use the \texttt{Problem::AddResidualBlock} and \texttt{Problem::AddParameterBlock} methods.
				415
				416	For example a problem containing 3 parameter blocks of sizes 3, 4 and 5
				417	respectively and two residual blocks of size 2 and 6:
				418
				419	\begin{minted}{c++}
				420	double x1[] = { 1.0, 2.0, 3.0 };
				421	double x2[] = { 1.0, 2.0, 3.0, 5.0 };
				422	double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };
				423
				424	Problem problem;
				425	problem.AddResidualBlock(new MyUnaryCostFunction(...), x1);
				426	problem.AddResidualBlock(new MyBinaryCostFunction(...), x2, x3);
				427	\end{minted}
				428
				429
				430	\texttt{AddResidualBlock} as the name implies, adds a residual block to the problem. It adds a cost function, an optional loss function, and connects the cost function to a set of parameter blocks.
				431
				432	The cost
				433	function carries with it information about the sizes of the
				434	parameter blocks it expects. The function checks that these match
				435	the sizes of the parameter blocks listed in \texttt{parameter\_blocks}. The
				436	program aborts if a mismatch is detected. \texttt{loss\_function} can be
				437	\texttt{NULL}, in which case the cost of the term is just the squared norm
				438	of the residuals.
				439
				440	The user has the option of explicitly adding the parameter blocks
				441	using \texttt{AddParameterBlock}. This causes additional correctness
				442	checking; however, \texttt{AddResidualBlock} implicitly adds the parameter
				443	blocks if they are not present, so calling \texttt{AddParameterBlock}
				444	explicitly is not required.
				445
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	446
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	447	\texttt{Problem} by default takes ownership of the
				448	\texttt{cost\_function} and \texttt{loss\_function pointers}. These objects remain
				449	live for the life of the \texttt{Problem} object. If the user wishes to
				450	keep control over the destruction of these objects, then they can
				451	do this by setting the corresponding enums in the \texttt{Options} struct.
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	452
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	453
				454	Note that even though the Problem takes ownership of \texttt{cost\_function}
				455	and \texttt{loss\_function}, it does not preclude the user from re-using
				456	them in another residual block. The destructor takes care to call
				457	delete on each \texttt{cost\_function} or \texttt{loss\_function} pointer only once,
				458	regardless of how many residual blocks refer to them.
				459
				460	\texttt{AddParameterBlock} explicitly adds a parameter block to the \texttt{Problem}. Optionally it allows the user to associate a LocalParameterization object with the parameter block too. Repeated calls with the same arguments are ignored. Repeated
				461	calls with the same double pointer but a different size results in undefined behaviour.
				462
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	463	You can set any parameter block to be constant using
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	464
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	465	\texttt{Problem::SetParameterBlockConstant}
Keir Mierle	8ebb073	2012-04-30 23:09:08 -0700	[diff] [blame]	466
				467	and undo this using
				468
				469	\texttt{Problem::SetParameterBlockVariable}.
				470
				471	In fact you can set any number of parameter blocks to be constant, and Ceres is smart enough to figure out what part of the problem you have constructed depends on the parameter blocks that are free to change and only spends time solving it. So for example if you constructed a problem with a million parameter blocks and 2 million residual blocks, but then set all but one parameter blocks to be constant and say only 10 residual blocks depend on this one non-constant parameter block. Then the computational effort Ceres spends in solving this problem will be the same if you had defined a problem with one parameter block and 10 residual blocks.
				472
				473	\texttt{Problem} by default takes ownership of the
				474	\texttt{cost\_function}, \texttt{loss\_function} and \\ \texttt{local\_parameterization} pointers. These objects remain
				475	live for the life of the \texttt{Problem} object. If the user wishes to
				476	keep control over the destruction of these objects, then they can
				477	do this by setting the corresponding enums in the \texttt{Options} struct. Even though \texttt{Problem} takes ownership of these pointers, it does not preclude the user from re-using them in another residual or parameter block. The destructor takes care to call
Sameer Agarwal	122cf83	2012-08-24 16:28:27 -0700	[diff] [blame]	478	delete on each pointer only once.