Miscellaneous CUDA related changes.

1. Fix a stupid error in types.cc
2. Update documentation for Solver::Options::dense_linear_algebra_library_type
3. Add a note to installation.rst to update the installation docs.
4. Mention GPU acceleration in features.rst

Change-Id: Id63202ff090e23bbb211d2ee458559fb8046281d

commit: 47502b833988c1206c2fe65eaf626858e49e4151
author: Sameer Agarwal <sameeragarwal@google.com> Mon Feb 14 21:37:40 2022 -0800
committer: Sameer Agarwal <sameeragarwal@google.com> Mon Feb 14 21:45:34 2022 -0800
tree: 4d725dacbea0d6fb8bf9b1edb39b4f4ee75b1c5b
parent: bb299668105bed86d3b345a19b63f9d0f89d4f95
diff --git a/docs/source/features.rst b/docs/source/features.rst
index 634be9d..956eb00 100644
--- a/docs/source/features.rst
+++ b/docs/source/features.rst

@@ -46,10 +46,11 @@
     computational cost in all of these methods is the solution of a
     linear system. To this end Ceres ships with a variety of linear
     solvers - dense QR and dense Cholesky factorization (using
-    `Eigen`_ or `LAPACK`_) for dense problems, sparse Cholesky
-    factorization (`SuiteSparse`_, `CXSparse`_ or `Eigen`_) for large
-    sparse problems, custom Schur complement based dense, sparse, and
-    iterative linear solvers for `bundle adjustment`_ problems.
+    `Eigen`_, `LAPACK`_ or `CUDA`_) for dense problems, sparse
+    Cholesky factorization (`SuiteSparse`_, `Apple's Accelerate`_,
+    `CXSparse`_ `Eigen`_) for large sparse problems, custom Schur
+    complement based dense, sparse, and iterative linear solvers for
+    `bundle adjustment`_ problems.
 
   - **Line Search Solvers** - When the problem size is so large that
     storing and factoring the Jacobian is not feasible or a low
@@ -62,6 +63,9 @@
   modern C++ threads based multithreading of the Jacobian evaluation
   and the linear solvers.
 
+* **GPU Acceleration** If your system supports `CUDA`_ then Ceres
+  Solver can use the Nvidia GPU on your system to speed up the solver.
+
 * **Solution Quality** Ceres is the `best performing`_ solver on the NIST
   problem set used by Mondragon and Borchers for benchmarking
   non-linear least squares solvers.
@@ -89,3 +93,5 @@
 .. _CXSparse: https://www.cise.ufl.edu/research/sparse/CXSparse/
 .. _automatic: http://en.wikipedia.org/wiki/Automatic_differentiation
 .. _numeric: http://en.wikipedia.org/wiki/Numerical_differentiation
+.. _CUDA : https://developer.nvidia.com/cuda-toolkit
+.. _Apple's Accelerate: https://developer.apple.com/documentation/accelerate/sparse_solvers

diff --git a/docs/source/installation.rst b/docs/source/installation.rst
index a924d95..3591901 100644
--- a/docs/source/installation.rst
+++ b/docs/source/installation.rst

@@ -104,6 +104,11 @@
   ``SuiteSparse``, and optionally used by Ceres directly for some
   operations.
 
+  TODO::
+
+    1. Add a more detailed note about Intel MKL.
+    2. Add detailed instructions about CUDA
+
   On ``UNIX`` OSes other than macOS we recommend `ATLAS
   <http://math-atlas.sourceforge.net/>`_, which includes ``BLAS`` and
   ``LAPACK`` routines. It is also possible to use `OpenBLAS

diff --git a/docs/source/nnls_solving.rst b/docs/source/nnls_solving.rst
index 9946645..a3225db 100644
--- a/docs/source/nnls_solving.rst
+++ b/docs/source/nnls_solving.rst

@@ -1325,16 +1325,17 @@
    Default:``EIGEN``
 
    Ceres supports using multiple dense linear algebra libraries for
-   dense matrix factorizations. Currently ``EIGEN`` and ``LAPACK`` are
-   the valid choices. ``EIGEN`` is always available, ``LAPACK`` refers
-   to the system ``BLAS + LAPACK`` library which may or may not be
-   available.
+   dense matrix factorizations. Currently ``EIGEN``, ``LAPACK`` and
+   ``CUDA`` are the valid choices. ``EIGEN`` is always available,
+   ``LAPACK`` refers to the system ``BLAS + LAPACK`` library which may
+   or may not be available. ``CUDA`` refers to Nvidia's GPU based
+   dense linear algebra library which may or may not be available.
 
    This setting affects the ``DENSE_QR``, ``DENSE_NORMAL_CHOLESKY``
    and ``DENSE_SCHUR`` solvers. For small to moderate sized probem
    ``EIGEN`` is a fine choice but for large problems, an optimized
-   ``LAPACK + BLAS`` implementation can make a substantial difference
-   in performance.
+   ``LAPACK + BLAS`` or ``CUDA`` implementation can make a substantial
+   difference in performance.
 
 .. member:: SparseLinearAlgebraLibrary Solver::Options::sparse_linear_algebra_library_type
 

diff --git a/include/ceres/solver.h b/include/ceres/solver.h
index 35644c4..026fc1c0 100644
--- a/include/ceres/solver.h
+++ b/include/ceres/solver.h

@@ -364,23 +364,23 @@
     std::unordered_set<ResidualBlockId>
         residual_blocks_for_subset_preconditioner;
 
-    // Ceres supports using multiple dense linear algebra libraries
-    // for dense matrix factorizations. Currently EIGEN and LAPACK are
-    // the valid choices. EIGEN is always available, LAPACK refers to
-    // the system BLAS + LAPACK library which may or may not be
+    // Ceres supports using multiple dense linear algebra libraries for dense
+    // matrix factorizations. Currently EIGEN, LAPACK and CUDA are the valid
+    // choices. EIGEN is always available, LAPACK refers to the system BLAS +
+    // LAPACK library which may or may not be available. CUDA refers to Nvidia's
+    // GPU based dense linear algebra library, which may or may not be
     // available.
     //
-    // This setting affects the DENSE_QR, DENSE_NORMAL_CHOLESKY and
-    // DENSE_SCHUR solvers. For small to moderate sized problem EIGEN
-    // is a fine choice but for large problems, an optimized LAPACK +
-    // BLAS implementation can make a substantial difference in
-    // performance.
+    // This setting affects the DENSE_QR, DENSE_NORMAL_CHOLESKY and DENSE_SCHUR
+    // solvers. For small to moderate sized problem EIGEN is a fine choice but
+    // for large problems, an optimized LAPACK + BLAS or CUDA implementation can
+    // make a substantial difference in performance.
     DenseLinearAlgebraLibraryType dense_linear_algebra_library_type = EIGEN;
 
-    // Ceres supports using multiple sparse linear algebra libraries
-    // for sparse matrix ordering and factorizations. Currently,
-    // SUITE_SPARSE and CX_SPARSE are the valid choices, depending on
-    // whether they are linked into Ceres at build time.
+    // Ceres supports using multiple sparse linear algebra libraries for sparse
+    // matrix ordering and factorizations. Currently, SUITE_SPARSE and CX_SPARSE
+    // are the valid choices, depending on whether they are linked into Ceres at
+    // build time.
     SparseLinearAlgebraLibraryType sparse_linear_algebra_library_type =
 #if !defined(CERES_NO_SUITESPARSE)
         SUITE_SPARSE;

diff --git a/internal/ceres/types.cc b/internal/ceres/types.cc
index 8cfbc77..4824267 100644
--- a/internal/ceres/types.cc
+++ b/internal/ceres/types.cc

@@ -433,7 +433,7 @@
 #ifdef CERES_NO_CUDA
     return false;
 #else
-    return false;
+    return true;
 #endif
   }
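
For readers skimming the types.cc hunk, the following is a minimal standalone sketch of the compile-time availability pattern that the one-line fix appears to restore. It is not the Ceres source: the enum `DenseLibrary` and the functions `IsDenseLibraryAvailable` and `ChooseDenseLibrary` are hypothetical names introduced only for illustration. Only the `CERES_NO_CUDA` macro and the true/false branches are taken from the diff above.

```cpp
// Standalone illustration only. The names below are hypothetical stand-ins
// and are not copied from the Ceres headers.
enum DenseLibrary { kEigen, kCuda };

// Reports whether a dense backend was compiled in. The CUDA branch mirrors
// the hunk above: false when CERES_NO_CUDA is defined, true otherwise.
inline bool IsDenseLibraryAvailable(DenseLibrary type) {
  if (type == kEigen) {
    return true;  // The docs in this commit say EIGEN is always available.
  }
  if (type == kCuda) {
#ifdef CERES_NO_CUDA
    return false;
#else
    return true;  // Before the fix, this branch also returned false.
#endif
  }
  return false;
}

// Hypothetical helper: fall back to Eigen when the requested backend is
// not available in this build.
inline DenseLibrary ChooseDenseLibrary(DenseLibrary requested) {
  return IsDenseLibraryAvailable(requested) ? requested : kEigen;
}
```

With the pre-fix code, a check like this would report CUDA as unavailable even in a build that had CUDA support compiled in, which is presumably the error the commit message refers to.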