|
|
|
|
|
by latchkey
852 days ago
|
|
Performance for GPUs isn't just speed, but also power efficiency. The complexity of GPUs doesn't lend itself to just being solved with better tooling. They are also not going to get cheaper... especially the high end ones with tons of HBM3 memory. Given that data centers only have so much power and AI really needs to be in the same data center as the data, if you can squeeze out a bit more power efficiency so you can fit more cards, you are getting gains there as well. When I was mining ethereum, the guy who wrote the mining software used an oscilloscope to squeeze an an extra 5-10% out of our cards and that was after having used them for years. That translated to saving about 1 MW of power across all of our data centers. Let me also remind you that GPUs are silicon snowflakes. No two perform exactly the same. They all require very specific individual tuning to get the best performance out of them. This tuning is not even at the software level, but actual changes to voltage/memory timings/clock speeds. |
|
I suspect a lot of AI inference (thought probably not the majority) will happen on mobile devices in the future. There power is also at a premium, and less fungible with money.