| HN Mirror


by lmeyerov 382 days ago

There is an interesting subtlety to what you are saying:

* Inferencing is getting cheaper: 40-80% / yr in hardware + 40-80% / yr in software. The annual numbers will change, but for the next 5 years, it seems fine to assume continued improvement at meaningful levels.

* Subtlety 1: What % of that improvement is only accessible to large providers with bigger workloads and bigger capitalization? Think more opportunities for batching, specialization, bulk purchasing, ... . That edge is what gives them the ability to both underprice others and keep a profit margin, vs. being a loss leader. A lot of the annual hw/sw improvement is commoditized for low-scale. Eg, hyperscalers pay 10% less to Nvidia than the next tier of smaller buyers, and I'm unsure of how to think of the margin benefit of custom hw.

* Subtlety 2: If things keep going well, there may be only a limited window around workloads "tricky enough that only they can do it". Eg, if commodity + OSS running on 2nd-tier providers is "good enough" and also reaches low COGS, so it's just regular data center stuff, all this becomes unclear again. I can likewise imagine that becoming true of the wider use case of text, while image/video/audio ends up more specialized.
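
To get a feel for what the 40-80% / yr figures would compound to over 5 years, here is a rough back-of-envelope sketch. It assumes "X% cheaper per year" means the per-unit cost is multiplied by (1 - X) each year, and that the hardware and software gains are independent and simply multiply. Both are my reading, not something stated above, and `remaining_cost_fraction` is a made-up helper name.

```python
def remaining_cost_fraction(hw_rate: float, sw_rate: float, years: int) -> float:
    """Fraction of today's per-unit inference cost left after `years`.

    Assumption (not from the comment): each yearly rate is a multiplicative
    cost reduction, and hardware and software gains stack independently.
    """
    per_year = (1.0 - hw_rate) * (1.0 - sw_rate)
    return per_year ** years


# Low and high ends of the 40-80% / yr range, applied to both hw and sw.
for rate in (0.4, 0.8):
    frac = remaining_cost_fraction(rate, rate, years=5)
    print(f"{rate:.0%}/yr hw + sw: cost is {frac:.2e} of today's after 5 years")
```

Even the low end of the range compounds to a very large drop under these assumptions, which is why the question of who actually captures the improvement (Subtlety 1) matters more than the exact annual number.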