|
|
|
|
|
by modeless
805 days ago
|
|
Intel Gaudi 3 has more interconnect bandwidth than this has memory bandwidth. By a lot. I guess they can't be fairly compared without knowing the TCO for each. I know in the past Google's TPU per-chip specs lagged Nvidia but the much lower TCO made them a slam dunk for Google's inference workloads. But this seems pretty far behind the state of the art. No FP8 either. |
|
From the Meta post: "This chip’s architecture is fundamentally focused on providing the right balance of compute, memory bandwidth, and memory capacity for serving ranking and recommendation models."
Optimizing for ranking/recommendation models is very different from general purpose training/inference.