by twothreeone 621 days ago
https://finance.yahoo.com/news/cerebras-launches-world-faste...
2 comments

Nothing for the training part of the MLPerf benchmark. If they're competing only on inference, they face stiff competition from specialized inference-NPU makers like Hailo (whose chip is even part of the official Raspberry Pi AI Kit), Qualcomm, and tons of other players; from players like Lightmatter that use optics instead of electrons for inference; and from SIMD inference on the highly abundant CPU servers, which unlike GPUs are never in shortage (and which have recently gained specialized inference ops beyond plain SIMD ones).
This isn't a benchmark, it's a press release. MLPerf has an inference component so they could have released numbers, but they chose not to.

At the end of the day it's all about performance per dollar/TCO, too, not just raw perf. A standardized benchmark helps to evaluate that.
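
To make the perf-per-dollar point concrete, here's a rough sketch. Everything in it is an assumption for illustration: the function name, the parameters, and the simplification that TCO is just purchase price plus energy (a real TCO would also count cooling, hosting, staff, software effort, etc.).

```python
def perf_per_dollar(tokens_per_second: float,
                    hardware_cost: float,
                    power_kw: float,
                    price_per_kwh: float,
                    years: float) -> float:
    """Throughput divided by a simplified TCO (purchase price + energy).

    Hypothetical helper: the TCO model is deliberately minimal and
    ignores cooling, hosting, staffing, and software costs.
    """
    hours = years * 365 * 24
    tco = hardware_cost + power_kw * hours * price_per_kwh
    if tco <= 0:
        raise ValueError("TCO must be positive")
    return tokens_per_second / tco
```

With made-up inputs, a system with twice the raw throughput but five times the cost comes out behind on this metric, which is the kind of comparison a standardized benchmark number makes possible.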

My guess is that they neglected the software component (hardware guys always disdain software) and have to bend over backwards to get their hardware to run specific models (and only those specific models) well. Or potentially common models don't run well because their cross-chip interconnect is too slow.

MLPerf brings in exactly zero revenue. If they have sold every chip they can make for the next 2+ years, why would they be diverting resources to MLPerf benchmarking?

Artificial Analysis does good API provider inference benchmarking and has evaluated Cerebras, Groq, SambaNova, the many Nvidia-based solutions, etc. IMO it makes way more sense to benchmark actual usable endpoints than to submit closed and modified implementations to MLCommons. Graphcore had the fastest BERT submission at one point (when BERT was relevant lol) and it didn't really move the needle at all.

With Artificial Analysis I wonder if model tweaks are detectable. That’s the benefit of a standardized benchmark: you’re testing the hardware. If some inference vendor changes Llama under the hood, the changes are known. And of course if you don’t include precise repro instructions in your standardized benchmark, nobody can tell how much money you’re losing (that is, how many chips are serving your requests).