
The ever-circulating rumour is 1.7T-1.8T parameters for the whole thing. But it is not well substantiated: it was mostly started by SemiAnalysis and geohot, based on rather loose speculation (such as API latency and price), and not much solid evidence has emerged to confirm it since.

And of course, it must have changed substantially with GPT-4-Turbo and GPT-4o. It would make sense if the cost reduction was larger than the price reduction (they probably have a higher profit margin now), and the price reduction has been very significant since the GPT-4 release.