by WhitneyLand
921 days ago
I am not buying this at all. But I'm not a hardware guy, so maybe someone can help with why this is not true:

- Crypto hardware needed SHA256, which is basically tons of bitwise operations. That's way simpler than the tons of matrix ops transformers need.
- Nvidia wasn't focused on crypto acceleration as a core competency. They are focused on this, and are already years down the path.
- One of the biggest bottlenecks is memory bandwidth. That is also not cheap or simple to do.
- Say they do have a great design. What process are they going to build it on? There are some big customers out there waiting for TSMC space already.

Maybe they have IP and it's more of a patent play.

(I mention crypto only as an example of custom hardware competing with a GPU)
This is precisely why people are trying to put logic into memory instead of just making the logic chips simpler. Compute being 10x faster doesn't mean much when you want real-time, near-zero latency in current (and potentially future) ML workloads. Memory bandwidth matters much more at low batch sizes, and even though this chip comes with HBM3E (which is cutting edge), that by itself won't make it faster than H200/MI300X.
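
To make the batch-size point concrete, here is a rough roofline-style sketch in Python. It assumes a decoding step costs about 2 FLOPs per parameter per sequence and has to stream every weight from memory once, and it ignores KV-cache traffic, interconnect, and kernel overheads. The function name and all hardware numbers are made up for illustration; they are not specs of this chip, the H200, or the MI300X.

```python
def decode_step_seconds(params, bytes_per_param, batch, peak_flops, mem_bandwidth):
    """Roofline-style lower bound on the time for one decoding step.

    Assumes ~2 FLOPs per parameter per sequence in the batch, and that all
    weights are read from memory once per step (KV cache ignored).
    """
    compute_s = 2 * params * batch / peak_flops
    memory_s = params * bytes_per_param / mem_bandwidth
    # The step cannot finish before both the math and the weight reads are done.
    return max(compute_s, memory_s)


# Hypothetical numbers, chosen only to show the shape of the curve.
PARAMS = 70e9          # model parameters
BYTES_PER_PARAM = 2    # 16-bit weights
BASE_FLOPS = 1e15      # baseline accelerator, FLOP/s
BANDWIDTH = 4e12       # memory bandwidth, bytes/s (same for both chips)

for batch in (1, 8, 64, 512, 4096):
    base = decode_step_seconds(PARAMS, BYTES_PER_PARAM, batch, BASE_FLOPS, BANDWIDTH)
    fast = decode_step_seconds(PARAMS, BYTES_PER_PARAM, batch, 10 * BASE_FLOPS, BANDWIDTH)
    print(f"batch={batch:5d}  speedup from 10x compute: {base / fast:.2f}x")
```

Under these assumed numbers the 10x-compute chip shows no speedup at small batches, because both chips are waiting on the same memory reads, and it only approaches the full 10x once the batch is large enough for the math to dominate.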