by lordofgibbons 216 days ago
At what quantization? And if it is in fact quantized below FP8, how is performance affected across the various benchmarks?
1 comment

They claim they don't use quantization.

The reason for their speed is this chip: https://www.cerebras.ai/chip