| HN Mirror



	by fallmonkey 301 days ago
	The estimate for output tokens is too low, since one reasoning-enabled response can burn through thousands of output tokens. It's also low for input tokens, since in actual use there's a lot of context (memory, agents.md, rules, etc.) included nowadays.

1 comments

atq2119 301 days ago

When using APIs, you pay for reasoning tokens the same way you pay for actual outputs. So the per-token estimate is not affected by reasoning.

What reasoning affects is the ratio of input to output tokens, and since input tokens are cheaper, that may well affect the economics in the end.
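To make the arithmetic concrete, here is a rough sketch. The prices and token counts are made-up placeholders, not any provider's real rates; the only assumptions carried over from the point above are that reasoning tokens are billed at the output rate and that input tokens are cheaper than output tokens.

```python
def request_cost(input_tokens, output_tokens, reasoning_tokens, in_price, out_price):
    """Dollar cost of one request. Prices are per million tokens (hypothetical).

    Reasoning tokens are billed at the output rate, as described above.
    """
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * in_price + billed_output * out_price) / 1_000_000


def blended_price(input_tokens, output_tokens, reasoning_tokens, in_price, out_price):
    """Average dollar price per million tokens across the whole request."""
    total_tokens = input_tokens + output_tokens + reasoning_tokens
    cost = request_cost(input_tokens, output_tokens, reasoning_tokens, in_price, out_price)
    return cost / total_tokens * 1_000_000


# Hypothetical rates: $1 per million input tokens, $4 per million output tokens.
IN_PRICE, OUT_PRICE = 1.0, 4.0

# Same prompt and same visible answer, with and without 3000 reasoning tokens.
plain = blended_price(2000, 500, 0, IN_PRICE, OUT_PRICE)
reasoning = blended_price(2000, 500, 3000, IN_PRICE, OUT_PRICE)

print(f"blended $/M tokens without reasoning: {plain:.2f}")
print(f"blended $/M tokens with reasoning:    {reasoning:.2f}")
```

The per-token rates never change between the two cases. What moves is the mix: reasoning pushes more of the request into the pricier output bucket, so the blended price per token goes up.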


fallmonkey 301 days ago

Correct, and with reasoning, the ratio is totally off. As others have pointed out, actual usage is way higher (much more than 3-5x) than the estimate in the article, which probably assumes very light users.
