Speed up InvertPSDMatrix

For matrices of size <= 4, Eigen implements an optimized matrix
inverse, which is orders of magnitude faster than computing the
Cholesky factorization and inverting it. This can have a significant
impact on the performance of the SchurEliminator.

This change implements this optimization, adds tests and benchmarks.

Before this change on my MacBook Pro

-----------------------------------------------------------------------------------
Benchmark                                            Time           CPU Iterations
-----------------------------------------------------------------------------------
BenchmarkFixedSizedInvertPSDMatrix<1>                0 ns          0 ns 1000000000
BenchmarkFixedSizedInvertPSDMatrix<2>              102 ns        101 ns    6504851
BenchmarkFixedSizedInvertPSDMatrix<3>     	   164 ns        164 ns    4297669
BenchmarkFixedSizedInvertPSDMatrix<4>              200 ns        200 ns    3933623
BenchmarkFixedSizedInvertPSDMatrix<5>              353 ns        353 ns    1930454
BenchmarkFixedSizedInvertPSDMatrix<6>              428 ns        427 ns    1629074
BenchmarkFixedSizedInvertPSDMatrix<7>              559 ns        558 ns    1211639
BenchmarkFixedSizedInvertPSDMatrix<8>              527 ns        527 ns    1000000
BenchmarkFixedSizedInvertPSDMatrix<9>              873 ns        873 ns     902713
BenchmarkFixedSizedInvertPSDMatrix<10>             892 ns        892 ns     787410
BenchmarkFixedSizedInvertPSDMatrix<11>            1201 ns       1201 ns     564334
BenchmarkFixedSizedInvertPSDMatrix<12>            1081 ns       1080 ns     588359
BenchmarkDynamicallyInvertPSDMatrix/1              322 ns        322 ns    2244892
BenchmarkDynamicallyInvertPSDMatrix/2              362 ns        362 ns    1869693
BenchmarkDynamicallyInvertPSDMatrix/3              455 ns        455 ns    1604092
BenchmarkDynamicallyInvertPSDMatrix/4              443 ns        443 ns    1578062
BenchmarkDynamicallyInvertPSDMatrix/5              648 ns        647 ns     963232
BenchmarkDynamicallyInvertPSDMatrix/6              756 ns        755 ns     899766
BenchmarkDynamicallyInvertPSDMatrix/7              906 ns        905 ns     740506
BenchmarkDynamicallyInvertPSDMatrix/8              885 ns        885 ns     790657
BenchmarkDynamicallyInvertPSDMatrix/9             1219 ns       1219 ns     600503
BenchmarkDynamicallyInvertPSDMatrix/10            1267 ns       1266 ns     534555
BenchmarkDynamicallyInvertPSDMatrix/11            1580 ns       1579 ns     469591
BenchmarkDynamicallyInvertPSDMatrix/12            1366 ns       1365 ns     514513

after this change:

-----------------------------------------------------------------------------------
Benchmark                                            Time           CPU Iterations
-----------------------------------------------------------------------------------
BenchmarkFixedSizedInvertPSDMatrix<1>                0 ns          0 ns 1000000000
BenchmarkFixedSizedInvertPSDMatrix<2>                1 ns          1 ns 1000000000
BenchmarkFixedSizedInvertPSDMatrix<3>                1 ns          1 ns  514399512
BenchmarkFixedSizedInvertPSDMatrix<4>                2 ns          2 ns  320587683
BenchmarkFixedSizedInvertPSDMatrix<5>              372 ns        372 ns    1856986
BenchmarkFixedSizedInvertPSDMatrix<6>              446 ns        446 ns    1552502
BenchmarkFixedSizedInvertPSDMatrix<7>              571 ns        570 ns    1208021
BenchmarkFixedSizedInvertPSDMatrix<8>              586 ns        584 ns    1090988
BenchmarkFixedSizedInvertPSDMatrix<9>             1003 ns       1001 ns     753279
BenchmarkFixedSizedInvertPSDMatrix<10>            1074 ns       1070 ns     689974
BenchmarkFixedSizedInvertPSDMatrix<11>            1361 ns       1351 ns     545388
BenchmarkFixedSizedInvertPSDMatrix<12>            1160 ns       1158 ns     615742
BenchmarkDynamicallyInvertPSDMatrix/1              326 ns        326 ns    2206552
BenchmarkDynamicallyInvertPSDMatrix/2              362 ns        361 ns    1820982
BenchmarkDynamicallyInvertPSDMatrix/3              432 ns        431 ns    1696361
BenchmarkDynamicallyInvertPSDMatrix/4              473 ns        472 ns    1294115
BenchmarkDynamicallyInvertPSDMatrix/5              657 ns        656 ns     917888
BenchmarkDynamicallyInvertPSDMatrix/6              804 ns        802 ns     884050
BenchmarkDynamicallyInvertPSDMatrix/7              936 ns        935 ns     679565
BenchmarkDynamicallyInvertPSDMatrix/8              915 ns        915 ns     695548
BenchmarkDynamicallyInvertPSDMatrix/9             1299 ns       1293 ns     583256
BenchmarkDynamicallyInvertPSDMatrix/10            1300 ns       1296 ns     562959
BenchmarkDynamicallyInvertPSDMatrix/11            1617 ns       1610 ns     480393
BenchmarkDynamicallyInvertPSDMatrix/12            1380 ns       1379 ns     503934

Change-Id: Id0d3bbe6d610ee7a3004bb8e2b657b90307b9805
4 files changed
tree: df5b32b0b716046b37661fa6e6cca6663c4dadb8
  1. .clang-format
  2. .gitignore
  3. .travis.yml
  4. BUILD
  5. CMakeLists.txt
  6. CONTRIBUTING.md
  7. LICENSE
  8. README.md
  9. WORKSPACE
  10. bazel/
  11. cmake/
  12. config/
  13. data/
  14. docs/
  15. examples/
  16. include/
  17. internal/
  18. package.xml
  19. scripts/
  20. travis/
README.md

Build Status

Ceres Solver

Ceres Solver is an open source C++ library for modeling and solving large, complicated optimization problems. It is a feature rich, mature and performant library which has been used in production at Google since 2010. Ceres Solver can solve two kinds of problems.

  1. Non-linear Least Squares problems with bounds constraints.
  2. General unconstrained optimization problems.

Please see ceres-solver.org for more information.