


	by ryao 384 days ago
	Last year, I read through public documents and estimated that their production was limited to ~300 wafers per year from TSMC. That is not Nvidia-level scale, but it is scale. There are many companies that sell tokens from an API and many more that need hardware to compute tokens. Cerebras posted a comparison of hardware options for these companies, so evaluating it as such is meaningful. It is perhaps less meaningful to the average person, who cannot afford this hardware, but plenty of people are curious about the options available to companies that sell tokens through APIs, since those options affect available capacity.

1 comment

latchkey 383 days ago

> There are many companies that sell tokens from an API

I was just at Dell Tech World and they proudly displayed a slide during the CTO keynote that said:

"Cost per token decreased 4 orders of magnitude"

Personally speaking, not a business I'd want to get into.


ryao 383 days ago

Some context is needed for this. The only way to get a difference of 4 orders of magnitude would be to compare incomparable things, like OpenAI’s most expensive model versus Llama 3.1 8B.
