Miscellaneous CUDA related changes.

1. Fix an error in types.cc where the CERES_NO_CUDA check returned
   false on both branches.
2. Update documentation for
   Solver::Options::dense_linear_algebra_library_type.
3. Add a TODO to installation.rst to expand the installation docs
   (Intel MKL and CUDA).
4. Mention GPU acceleration in features.rst.

Change-Id: Id63202ff090e23bbb211d2ee458559fb8046281d
diff --git a/docs/source/features.rst b/docs/source/features.rst
index 634be9d..956eb00 100644
--- a/docs/source/features.rst
+++ b/docs/source/features.rst
@@ -46,10 +46,11 @@
     computational cost in all of these methods is the solution of a
     linear system. To this end Ceres ships with a variety of linear
     solvers - dense QR and dense Cholesky factorization (using
-    `Eigen`_ or `LAPACK`_) for dense problems, sparse Cholesky
-    factorization (`SuiteSparse`_, `CXSparse`_ or `Eigen`_) for large
-    sparse problems, custom Schur complement based dense, sparse, and
-    iterative linear solvers for `bundle adjustment`_ problems.
+    `Eigen`_, `LAPACK`_ or `CUDA`_) for dense problems, sparse
+    Cholesky factorization (`SuiteSparse`_, `Apple's Accelerate`_,
+    `CXSparse`_ or `Eigen`_) for large sparse problems, custom Schur
+    complement based dense, sparse, and iterative linear solvers for
+    `bundle adjustment`_ problems.
 
   - **Line Search Solvers** - When the problem size is so large that
     storing and factoring the Jacobian is not feasible or a low
@@ -62,6 +63,9 @@
   modern C++ threads based multithreading of the Jacobian
   evaluation and the linear solvers.
 
+* **GPU Acceleration** If your system supports `CUDA`_ then Ceres
+  Solver can use the Nvidia GPU on your system to speed up the solver.
+
 * **Solution Quality** Ceres is the `best performing`_ solver on the
   NIST problem set used by Mondragon and Borchers for benchmarking
   non-linear least squares solvers.
@@ -89,3 +93,5 @@
 .. _CXSparse: https://www.cise.ufl.edu/research/sparse/CXSparse/
 .. _automatic: http://en.wikipedia.org/wiki/Automatic_differentiation
 .. _numeric: http://en.wikipedia.org/wiki/Numerical_differentiation
+.. _CUDA: https://developer.nvidia.com/cuda-toolkit
+.. _Apple's Accelerate: https://developer.apple.com/documentation/accelerate/sparse_solvers
diff --git a/docs/source/installation.rst b/docs/source/installation.rst
index a924d95..3591901 100644
--- a/docs/source/installation.rst
+++ b/docs/source/installation.rst
@@ -104,6 +104,11 @@
   ``SuiteSparse``, and optionally used by Ceres directly for some
   operations.
 
+  TODO::
+
+    1. Add a more detailed note about Intel MKL.
+    2. Add detailed instructions about CUDA
+
   On ``UNIX`` OSes other than macOS we recommend `ATLAS
   <http://math-atlas.sourceforge.net/>`_, which includes ``BLAS`` and
   ``LAPACK`` routines. It is also possible to use `OpenBLAS
diff --git a/docs/source/nnls_solving.rst b/docs/source/nnls_solving.rst
index 9946645..a3225db 100644
--- a/docs/source/nnls_solving.rst
+++ b/docs/source/nnls_solving.rst
@@ -1325,16 +1325,17 @@
    Default:``EIGEN``
 
    Ceres supports using multiple dense linear algebra libraries for
-   dense matrix factorizations. Currently ``EIGEN`` and ``LAPACK`` are
-   the valid choices. ``EIGEN`` is always available, ``LAPACK`` refers
-   to the system ``BLAS + LAPACK`` library which may or may not be
-   available.
+   dense matrix factorizations. Currently ``EIGEN``, ``LAPACK`` and
+   ``CUDA`` are the valid choices. ``EIGEN`` is always available,
+   ``LAPACK`` refers to the system ``BLAS + LAPACK`` library which may
+   or may not be available. ``CUDA`` refers to Nvidia's GPU based
+   dense linear algebra library which may or may not be available.
 
    This setting affects the ``DENSE_QR``, ``DENSE_NORMAL_CHOLESKY`` and
    ``DENSE_SCHUR`` solvers. For small to moderate sized probem
    ``EIGEN`` is a fine choice but for large problems, an optimized
-   ``LAPACK + BLAS`` implementation can make a substantial difference
-   in performance.
+   ``LAPACK + BLAS`` or ``CUDA`` implementation can make a substantial
+   difference in performance.
 
 .. member:: SparseLinearAlgebraLibrary Solver::Options::sparse_linear_algebra_library_type
 
diff --git a/include/ceres/solver.h b/include/ceres/solver.h
index 35644c4..026fc1c0 100644
--- a/include/ceres/solver.h
+++ b/include/ceres/solver.h
@@ -364,23 +364,23 @@
     std::unordered_set<ResidualBlockId>
         residual_blocks_for_subset_preconditioner;
 
-    // Ceres supports using multiple dense linear algebra libraries
-    // for dense matrix factorizations. Currently EIGEN and LAPACK are
-    // the valid choices. EIGEN is always available, LAPACK refers to
-    // the system BLAS + LAPACK library which may or may not be
+    // Ceres supports using multiple dense linear algebra libraries for dense
+    // matrix factorizations. Currently EIGEN, LAPACK and CUDA are the valid
+    // choices. EIGEN is always available, LAPACK refers to the system BLAS +
+    // LAPACK library which may or may not be available. CUDA refers to Nvidia's
+    // GPU based dense linear algebra library, which may or may not be
     // available.
     //
-    // This setting affects the DENSE_QR, DENSE_NORMAL_CHOLESKY and
-    // DENSE_SCHUR solvers. For small to moderate sized problem EIGEN
-    // is a fine choice but for large problems, an optimized LAPACK +
-    // BLAS implementation can make a substantial difference in
-    // performance.
+    // This setting affects the DENSE_QR, DENSE_NORMAL_CHOLESKY and DENSE_SCHUR
+    // solvers. For small to moderate sized problems EIGEN is a fine choice but
+    // for large problems, an optimized LAPACK + BLAS or CUDA implementation can
+    // make a substantial difference in performance.
     DenseLinearAlgebraLibraryType dense_linear_algebra_library_type = EIGEN;
 
-    // Ceres supports using multiple sparse linear algebra libraries
-    // for sparse matrix ordering and factorizations. Currently,
-    // SUITE_SPARSE and CX_SPARSE are the valid choices, depending on
-    // whether they are linked into Ceres at build time.
+    // Ceres supports using multiple sparse linear algebra libraries for sparse
+    // matrix ordering and factorizations. Currently, SUITE_SPARSE and CX_SPARSE
+    // are the valid choices, depending on whether they are linked into Ceres at
+    // build time.
     SparseLinearAlgebraLibraryType sparse_linear_algebra_library_type =
 #if !defined(CERES_NO_SUITESPARSE)
         SUITE_SPARSE;
diff --git a/internal/ceres/types.cc b/internal/ceres/types.cc
index 8cfbc77..4824267 100644
--- a/internal/ceres/types.cc
+++ b/internal/ceres/types.cc
@@ -433,7 +433,7 @@
 #ifdef CERES_NO_CUDA
     return false;
 #else
-    return false;
+    return true;
 #endif
   }
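
Note on the types.cc fix: the sketch below is a standalone illustration of the corrected preprocessor pattern, not the code in types.cc. The enum `DenseLibrary` and the functions `IsCudaAvailable` and `ChooseDenseLibrary` are hypothetical names introduced here; only the `CERES_NO_CUDA` macro and the `EIGEN`, `LAPACK` and `CUDA` names come from this change. It assumes a caller would want to fall back to `EIGEN`, which the updated documentation describes as always available.

```cpp
// Hypothetical stand-in for the Ceres enum of dense linear algebra backends;
// the enumerator names follow the documentation in this change.
enum DenseLibrary { EIGEN, LAPACK, CUDA };

// Sketch of the corrected check: CUDA is reported as available unless the
// build defined CERES_NO_CUDA. Before the fix, both branches returned false.
bool IsCudaAvailable() {
#ifdef CERES_NO_CUDA
  return false;
#else
  return true;
#endif
}

// Hypothetical helper (not part of Ceres): keep the requested backend unless
// it is CUDA and CUDA was compiled out, in which case fall back to EIGEN.
DenseLibrary ChooseDenseLibrary(DenseLibrary requested) {
  if (requested == CUDA && !IsCudaAvailable()) {
    return EIGEN;
  }
  return requested;
}
```

In a Ceres build, the corresponding request would be made by setting `Solver::Options::dense_linear_algebra_library_type` to `CUDA`, as the updated documentation describes.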