Go and measure it yourself, if you have one :) https://github.com/Mysticial/Flops/

You can also compute the theoretical peak Flops, which matches the experimental measurement nicely. You have to take into account:

- the clock frequency (~3.9 GHz on multithreaded workloads on my machine)
- the number of cores (16)
- the reciprocal throughput of the FMA instruction (~0.5 cycles, that is, 2 instructions per clock cycle per core)
- the number of flops per instruction (2 for the FMA instruction, that is, 1 multiply + 1 add)
- the SIMD vector width (4 for double, 8 for float).

Putting it together:

3.9e9 * 16 * 2 * 2 * 4 = 998.4 GFlops (double)
3.9e9 * 16 * 2 * 2 * 8 = 1996.8 GFlops (single)

The measured values on my machine are slightly higher, but close (1070 and 2151 GFlops respectively).

References:
https://www.agner.org/optimize/instruction_tables.pdf
https://www.agner.org/forum/viewtopic.php?t=56
https://gadgetversus.com/processor/amd-ryzen-9-5950x-gflops-...
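
If you want to plug in your own numbers, the peak estimate above is easy to script. This is only a rough sketch: the function and constant names are made up for illustration, and the "implied clock" at the end is just a back-calculation that assumes every other factor in the formula is exact, so don't read it as a measured frequency.

```python
def peak_gflops(clock_hz, cores, fma_per_cycle, flops_per_fma, simd_lanes):
    """Theoretical peak: clock * cores * FMAs per cycle * flops per FMA * SIMD lanes."""
    return clock_hz * cores * fma_per_cycle * flops_per_fma * simd_lanes / 1e9


# Figures quoted in the comment above (my machine; yours will differ).
CLOCK_HZ = 3.9e9      # sustained all-core clock
CORES = 16
FMA_PER_CYCLE = 2     # reciprocal throughput of 0.5 cycles
FLOPS_PER_FMA = 2     # 1 multiply + 1 add
LANES_DOUBLE = 4      # doubles per SIMD vector
LANES_SINGLE = 8      # floats per SIMD vector

double_peak = peak_gflops(CLOCK_HZ, CORES, FMA_PER_CYCLE, FLOPS_PER_FMA, LANES_DOUBLE)
single_peak = peak_gflops(CLOCK_HZ, CORES, FMA_PER_CYCLE, FLOPS_PER_FMA, LANES_SINGLE)

print(f"double: {double_peak:.1f} GFlops")  # double: 998.4 GFlops
print(f"single: {single_peak:.1f} GFlops")  # single: 1996.8 GFlops

# Back-calculate the clock that would explain the measured 1070 GFlops (double),
# assuming the other factors hold exactly.
flops_per_cycle_double = CORES * FMA_PER_CYCLE * FLOPS_PER_FMA * LANES_DOUBLE
implied_clock_ghz = 1070 / flops_per_cycle_double
print(f"implied clock: {implied_clock_ghz:.2f} GHz")
```

Swap in your own clock, core count and vector width to get the estimate for a different CPU.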