by aesthesia 2 days ago
There are some glaring local errors that make this analysis less than trustworthy. For instance, it assumes that corporate income tax applies directly to revenue rather than to profit, and it treats full depreciation of GPUs after 3 years as a generous assumption (6-year-old A100s are still in very high demand!). I would love to read a really well-thought-through investigation of inference costs and how they relate to token pricing, but I have low confidence that this is it.
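To make the tax point concrete, here is a rough sketch of how far apart the two treatments land. All figures are made-up illustration values, not numbers from the analysis, and the 21% rate is just an assumed flat rate; the function names are hypothetical.

```python
def tax_on_revenue(revenue: float, costs: float, rate: float) -> float:
    """The mistaken treatment: apply the income tax rate to gross revenue."""
    return revenue * rate


def tax_on_profit(revenue: float, costs: float, rate: float) -> float:
    """Apply the rate to revenue minus costs, floored at zero.

    Simplification: ignores loss carryforwards, credits, and so on.
    """
    return max(revenue - costs, 0.0) * rate


# Hypothetical figures for illustration only.
REVENUE = 100.0
COSTS = 80.0
RATE = 0.21  # assumed flat rate

wrong = tax_on_revenue(REVENUE, COSTS, RATE)
right = tax_on_profit(REVENUE, COSTS, RATE)
print(f"tax on revenue: {wrong:.2f}, tax on profit: {right:.2f}")
```

With thin margins the gap is large: taxing revenue can wipe out a profit that would otherwise survive taxing income.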

> GPUs will fully depreciate after 3 years (6-year-old A100s are still in very high demand!)

Depreciation is a tax thing. While it is supposed to track useful life, it almost never does.

For example, in the US, residential rental houses are depreciated on a 27.5-year schedule. I'm typing this from a house built in 1902....

Google has yet to decommission any of its Trilliums, and the V1s shipped in 2015.

The prices to rent V2 (2017) and later are on https://cloud.google.com/tpu/pricing .

Yep, in their analysis depreciation meant "get no useful work out of the GPU after this point," though.
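If "depreciation" is read that way, as the point where the GPU stops doing useful work, then the assumed lifetime feeds straight into the hourly hardware cost. A minimal sketch, assuming simple straight-line amortization with no resale value; the purchase price and the function name are hypothetical placeholders:

```python
HOURS_PER_YEAR = 24 * 365


def amortized_cost_per_hour(purchase_price: float,
                            useful_life_years: float,
                            utilization: float = 1.0) -> float:
    """Spread the purchase price evenly over the hours of useful work.

    Assumes straight-line amortization and zero salvage value.
    """
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return purchase_price / (useful_life_years * HOURS_PER_YEAR * utilization)


GPU_PRICE = 20_000.0  # hypothetical price, not from the analysis

three_year = amortized_cost_per_hour(GPU_PRICE, 3)
six_year = amortized_cost_per_hour(GPU_PRICE, 6)
print(f"3-year life: ${three_year:.3f}/h, 6-year life: ${six_year:.3f}/h")
```

Doubling the useful life halves the hardware cost per hour, so a 3-year cutoff is a pessimistic assumption if the cards really keep earning for 6.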
Oh, just noticed one other very significant error: they evaluate revenue using input token pricing while counting capacity using generated tokens per second. There's a big gap between input and output token pricing, and between prefill TPS and generation TPS.
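A quick sketch of why the mismatch matters. The throughput and price numbers below are invented for illustration (real values depend on model, hardware, and batch size), and the function name is hypothetical:

```python
def revenue_per_gpu_hour(tokens_per_second: float,
                         price_per_million_tokens: float) -> float:
    """Revenue if the GPU spends a full hour at this rate and price."""
    return tokens_per_second * 3600 / 1_000_000 * price_per_million_tokens


# Hypothetical figures for illustration only.
GENERATION_TPS = 1_000.0   # output tokens generated per second
PREFILL_TPS = 20_000.0     # input tokens processed per second
INPUT_PRICE = 1.0          # $ per million input tokens
OUTPUT_PRICE = 4.0         # $ per million output tokens

# The mismatch described above: generation throughput at the input price.
mismatched = revenue_per_gpu_hour(GENERATION_TPS, INPUT_PRICE)

# Consistent pairings: each throughput with its own price.
output_only = revenue_per_gpu_hour(GENERATION_TPS, OUTPUT_PRICE)
input_only = revenue_per_gpu_hour(PREFILL_TPS, INPUT_PRICE)

print(f"mismatched: ${mismatched:.2f}/h")
print(f"output only: ${output_only:.2f}/h, input only: ${input_only:.2f}/h")
```

A real workload mixes prefill and generation, so the true figure sits somewhere between the two consistent pairings; the point is only that the mismatched pairing understates both of them under these assumptions.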