|
|
|
|
|
by dpzmick
1841 days ago
|
|
I believe there's a hint towards the end of the article: > Note: Perlmutter’s “AI performance” is based on Nvidia’s half-precision numerical format (FP16 Tensor Core) with Nvidia’s sparsity feature enabled. FP16 is a 16 bit floating point format. FLOPS for top 500 are measured with LINPACK HPL, which says it is over 64 bit floating point values (I think): > HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark. (from https://www.netlib.org/benchmark/hpl/) This isn't totally disingenuous though. These FP16 operations are very useful for some kinds of calculations. |
|
ML for example. If you use 16bit precision you can fit the model in half the memory and your lookups are twice as fast. Newer GPU models offer "mixed precision mode" It takes some doing to get that working in your tooling though.