by stuntprogrammer
3707 days ago
Pascal won't be cheap, so the fair comparison is a top-end Xeon E5 v4: Pascal has about 7x the theoretical FP64 performance (assuming the Xeon does 2.2GHz * 22 cores * 8 AVX pipes per core * 2 for FMA per socket, given the price range). Similar story at FP32. However, the GPU wins on FP16. For a historical comparison, just pick one of the machines that did x teraflops. E.g. the first teraflop computer used ~6000 200MHz Pentium Pro chips around 1996.
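To make the arithmetic explicit, here is a rough sketch of the peak-rate calculation. The Xeon figures are the ones assumed above; the 5.3 TFLOPS FP64 figure for Pascal is my assumption (NVIDIA's advertised peak for the Tesla P100), not something stated in the comment.

```python
# Theoretical peak FP64 throughput, using the figures quoted in the comment.
clock_hz = 2.2e9          # 2.2GHz
cores = 22                # cores per socket
fp64_lanes_per_core = 8   # the "8 AVX pipes per core" from the comment
flops_per_fma = 2         # a fused multiply-add counts as two operations

xeon_peak = clock_hz * cores * fp64_lanes_per_core * flops_per_fma

# Assumption, not from the comment: advertised FP64 peak of a Tesla P100.
pascal_peak = 5.3e12

ratio = pascal_peak / xeon_peak

print(f"Xeon peak per socket: {xeon_peak / 1e9:.1f} GFLOPS")
print(f"Pascal / Xeon ratio:  {ratio:.1f}x")
```

Both numbers are theoretical peaks, so the ratio says nothing about what either chip sustains on a real workload.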
In describing the benchmark, they say:
In an attempt to obtain uniformity across all computers in performance reporting, the algorithm used in solving the system of equations in the benchmark procedure must conform to LU factorization with partial pivoting. In particular, the operation count for the algorithm must be 2/3 n^3 + O(n^2) double precision floating point operations. This excludes the use of a fast matrix multiply algorithm like "Strassen's Method" or algorithms which compute a solution in a precision lower than full precision (64 bit floating point arithmetic) and refine the solution using an iterative approach.
So, to summarize: if the fastest supercomputer on the planet ran LINPACK at about 4.9 TFLOPS in 2000, does that mean that, apples to apples on LINPACK (and only LINPACK), Pascal today would outperform that supercomputer?
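For a sense of scale, here is a small sketch of what the quoted operation count implies. It keeps only the leading 2/3 n^3 term and drops the O(n^2) part. The problem size `n` is a made-up illustration, and the function names are mine.

```python
def lu_flops(n: int) -> float:
    """Leading term of the LU operation count from the quoted rule: 2/3 * n^3."""
    return (2.0 / 3.0) * n ** 3


def solve_seconds(n: int, sustained_flops: float) -> float:
    """Time to factor an n x n system at a given sustained FP64 rate."""
    return lu_flops(n) / sustained_flops


n = 100_000          # hypothetical problem size, for illustration only
rate_2000 = 4.9e12   # the ~4.9 TFLOPS figure from the question

seconds = solve_seconds(n, rate_2000)
print(f"{lu_flops(n):.3e} operations, {seconds:.0f} s at 4.9 TFLOPS")
```

The comparison only holds if both rates are sustained on the benchmark itself. A GPU's theoretical peak would have to be discounted by whatever fraction it actually achieves on LU with partial pivoting, and the problem has to fit in its memory.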