commit | aefd37b18252a2a49a4825ca122cf2836dbcefaf | |
---|---|---|
author | Ahmed Taei <ataei@google.com> | Mon Aug 09 21:24:17 2021 -0700 |
committer | Ahmed Taei <ataei@google.com> | Mon Aug 09 21:24:17 2021 -0700 |
tree | 17ff6b4833d2b5662412676e85195548615646a0 | |
parent | dc20db303207841720052527bfc50cae8052fba5 | |
Refactor small_blas_gemm_benchmark

This allows benchmarking both dynamic and static problem sizes.

On my x86 2 GHz machine:

CPU Caches: L1 Data 32 KiB (x48), L1 Instruction 32 KiB (x48), L2 Unified 1024 KiB (x48), L3 Unified 39424 KiB (x2)

Benchmark | Time | CPU | Iterations |
---|---|---|---|
BM_MatrixMatrixMultiplyEigen_Static_2x3x4 | 10.9 ns | 10.9 ns | 63960496 |
BM_MatrixMatrixMultiplyEigen_Static_3x3x3 | 11.9 ns | 11.9 ns | 58093785 |
BM_MatrixMatrixMultiplyEigen_Static_4x4x4 | 24.9 ns | 24.9 ns | 28540133 |
BM_MatrixMatrixMultiplyEigen_Static_8x8x8 | 258 ns | 258 ns | 2755921 |
BM_MatrixMatrixMultiplyEigen_Static_9x9x3 | 255 ns | 255 ns | 2778628 |
BM_MatrixMatrixMultiplyEigen_Static_9x3x3 | 37.7 ns | 37.7 ns | 18556359 |
BM_MatrixMatrixMultiplyEigen_Static_3x9x9 | 220 ns | 220 ns | 3195163 |
BM_MatrixMatrixMultiplyEigen_Dynamic_2x3x4 | 37.8 ns | 37.7 ns | 18491162 |
BM_MatrixMatrixMultiplyEigen_Dynamic_3x3x3 | 48.8 ns | 48.8 ns | 13724884 |
BM_MatrixMatrixMultiplyEigen_Dynamic_4x4x4 | 74.5 ns | 74.5 ns | 9313146 |
BM_MatrixMatrixMultiplyEigen_Dynamic_8x8x8 | 271 ns | 271 ns | 2595807 |
BM_MatrixMatrixMultiplyEigen_Dynamic_9x9x3 | 259 ns | 259 ns | 2688515 |
BM_MatrixMatrixMultiplyEigen_Dynamic_9x3x3 | 123 ns | 123 ns | 5792115 |
BM_MatrixMatrixMultiplyEigen_Dynamic_3x9x9 | 236 ns | 236 ns | 2963896 |
BM_MatrixMatrixMultiplyNaive_Static_2x3x4 | 12.2 ns | 12.2 ns | 56472772 |
BM_MatrixMatrixMultiplyNaive_Static_3x3x3 | 15.5 ns | 15.5 ns | 44346456 |
BM_MatrixMatrixMultiplyNaive_Static_4x4x4 | 41.5 ns | 41.5 ns | 17196984 |
BM_MatrixMatrixMultiplyNaive_Static_8x8x8 | 199 ns | 199 ns | 3561730 |
BM_MatrixMatrixMultiplyNaive_Static_9x9x3 | 148 ns | 148 ns | 4764814 |
BM_MatrixMatrixMultiplyNaive_Static_9x3x3 | 38.4 ns | 38.4 ns | 17259019 |
BM_MatrixMatrixMultiplyNaive_Static_3x9x9 | 115 ns | 115 ns | 6104752 |
BM_MatrixMatrixMultiplyNaive_Dynamic_2x3x4 | 9.66 ns | 9.66 ns | 74722971 |
BM_MatrixMatrixMultiplyNaive_Dynamic_3x3x3 | 13.0 ns | 13.0 ns | 53435308 |
BM_MatrixMatrixMultiplyNaive_Dynamic_4x4x4 | 47.8 ns | 47.8 ns | 14358184 |
BM_MatrixMatrixMultiplyNaive_Dynamic_8x8x8 | 200 ns | 200 ns | 3572809 |
BM_MatrixMatrixMultiplyNaive_Dynamic_9x9x3 | 104 ns | 104 ns | 6793797 |
BM_MatrixMatrixMultiplyNaive_Dynamic_9x3x3 | 34.0 ns | 34.0 ns | 20790695 |
BM_MatrixMatrixMultiplyNaive_Dynamic_3x9x9 | 130 ns | 130 ns | 5170402 |
BM_MatrixTransposeMatrixMultiplyEigen_Static_2x3x4 | 10.3 ns | 10.3 ns | 69105234 |
BM_MatrixTransposeMatrixMultiplyEigen_Static_3x3x3 | 28.9 ns | 28.9 ns | 24478934 |
BM_MatrixTransposeMatrixMultiplyEigen_Static_4x4x4 | 23.7 ns | 23.7 ns | 29351926 |
BM_MatrixTransposeMatrixMultiplyEigen_Static_8x8x8 | 233 ns | 233 ns | 2929398 |
BM_MatrixTransposeMatrixMultiplyEigen_Static_9x9x3 | 211 ns | 211 ns | 3287409 |
BM_MatrixTransposeMatrixMultiplyEigen_Static_9x3x3 | 26.5 ns | 26.5 ns | 26515136 |
BM_MatrixTransposeMatrixMultiplyEigen_Static_3x9x9 | 196 ns | 196 ns | 3594314 |
BM_MatrixTransposeMatrixMultiplyEigen_Dynamic_2x3x4 | 9.05 ns | 9.05 ns | 77621001 |
BM_MatrixTransposeMatrixMultiplyEigen_Dynamic_3x3x3 | 11.1 ns | 11.1 ns | 62227812 |
BM_MatrixTransposeMatrixMultiplyEigen_Dynamic_4x4x4 | 25.5 ns | 25.5 ns | 27356089 |
BM_MatrixTransposeMatrixMultiplyEigen_Dynamic_8x8x8 | 248 ns | 248 ns | 2834983 |
BM_MatrixTransposeMatrixMultiplyEigen_Dynamic_9x9x3 | 229 ns | 229 ns | 3082369 |
BM_MatrixTransposeMatrixMultiplyEigen_Dynamic_9x3x3 | 28.4 ns | 28.4 ns | 24318629 |
BM_MatrixTransposeMatrixMultiplyEigen_Dynamic_3x9x9 | 229 ns | 229 ns | 3091288 |
BM_MatrixTransposeMatrixMultiplyNaive_Static_2x3x4 | 11.0 ns | 11.0 ns | 63773538 |
BM_MatrixTransposeMatrixMultiplyNaive_Static_3x3x3 | 19.1 ns | 19.1 ns | 37003139 |
BM_MatrixTransposeMatrixMultiplyNaive_Static_4x4x4 | 49.1 ns | 49.1 ns | 14142301 |
BM_MatrixTransposeMatrixMultiplyNaive_Static_8x8x8 | 244 ns | 244 ns | 2874755 |
BM_MatrixTransposeMatrixMultiplyNaive_Static_9x9x3 | 140 ns | 140 ns | 4992156 |
BM_MatrixTransposeMatrixMultiplyNaive_Static_9x3x3 | 46.2 ns | 46.2 ns | 15068317 |
BM_MatrixTransposeMatrixMultiplyNaive_Static_3x9x9 | 112 ns | 112 ns | 6213574 |
BM_MatrixTransposeMatrixMultiplyNaive_Dynamic_2x3x4 | 9.74 ns | 9.74 ns | 72155001 |
BM_MatrixTransposeMatrixMultiplyNaive_Dynamic_3x3x3 | 11.5 ns | 11.5 ns | 60070577 |
BM_MatrixTransposeMatrixMultiplyNaive_Dynamic_4x4x4 | 52.5 ns | 52.5 ns | 13473642 |
BM_MatrixTransposeMatrixMultiplyNaive_Dynamic_8x8x8 | 224 ns | 224 ns | 3124264 |
BM_MatrixTransposeMatrixMultiplyNaive_Dynamic_9x9x3 | 98.0 ns | 98.0 ns | 7199292 |
BM_MatrixTransposeMatrixMultiplyNaive_Dynamic_9x3x3 | 34.7 ns | 34.6 ns | 20203685 |
BM_MatrixTransposeMatrixMultiplyNaive_Dynamic_3x9x9 | 105 ns | 105 ns | 6653151 |

Change-Id: Iee403b4d27801d1614ecc1f78a1f8c0011514bd7
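To illustrate what "static" versus "dynamic" problem sizes can mean in a benchmark like this one, here is a minimal, standard-library-only sketch. It is not the code from this commit and does not use the Ceres or Google Benchmark APIs. The names `kDynamic`, `NaiveMultiply`, and `TimeMultiplyNs` are hypothetical, and reading a size such as 2x3x4 as m x k x n is an assumption made only for this example.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical sentinel meaning "this size is only known at run time".
constexpr int kDynamic = -1;

// Computes C (m x n) = A (m x k) * B (k x n); all matrices are row-major.
// Each template argument is either a compile-time size or kDynamic, in which
// case the matching run-time argument is used instead. With static sizes the
// loop bounds are constants, so the compiler is free to unroll the loops.
template <int kM, int kK, int kN>
void NaiveMultiply(const double* a, const double* b, double* c,
                   int m, int k, int n) {
  const int rows = (kM != kDynamic) ? kM : m;
  const int inner = (kK != kDynamic) ? kK : k;
  const int cols = (kN != kDynamic) ? kN : n;
  assert(rows > 0 && inner > 0 && cols > 0);
  for (int i = 0; i < rows; ++i) {
    for (int j = 0; j < cols; ++j) {
      double sum = 0.0;
      for (int p = 0; p < inner; ++p) {
        sum += a[i * inner + p] * b[p * cols + j];
      }
      c[i * cols + j] = sum;
    }
  }
}

// Runs the multiply `iterations` times and returns the mean wall-clock time
// per call in nanoseconds. This is a crude timer, not a replacement for a
// real benchmarking framework.
template <int kM, int kK, int kN>
double TimeMultiplyNs(int m, int k, int n, int iterations) {
  std::vector<double> a(m * k, 1.0), b(k * n, 2.0), c(m * n, 0.0);
  volatile double sink = 0.0;  // Keeps the compiler from discarding the work.
  const auto start = std::chrono::steady_clock::now();
  for (int it = 0; it < iterations; ++it) {
    NaiveMultiply<kM, kK, kN>(a.data(), b.data(), c.data(), m, k, n);
    sink = sink + c[0];
  }
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::nano>(stop - start).count() /
         iterations;
}
```

Under these assumptions, `TimeMultiplyNs<2, 3, 4>(2, 3, 4, iterations)` would time the static variant and `TimeMultiplyNs<kDynamic, kDynamic, kDynamic>(2, 3, 4, iterations)` the dynamic one for the same problem. Timings from such a sketch are not comparable to the figures reported above.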
Ceres Solver is an open source C++ library for modeling and solving large, complicated optimization problems. It is a feature-rich, mature and performant library which has been used in production at Google since 2010. Ceres Solver can solve two kinds of problems: non-linear least squares problems with bounds constraints, and general unconstrained optimization problems.
Please see ceres-solver.org for more information.