by monooso 78 days ago

Paul Kinlan published a blog post a couple of days ago [1] with some interesting data showing that output tokens account for only 4% of token usage.

It's a pretty wide-reaching article, so here's the relevant quote (emphasis mine):

> Real-world data from OpenRouter’s programming category shows 93.4% input tokens, 2.5% reasoning tokens, and just 4.0% output tokens. It’s almost entirely input.

[1]: https://aifoc.us/the-token-salary/

3 comments

verdverm 78 days ago

My own output token ratio is 2% (a 50% savings on the expensive tokens; I include thinking in this, which is often more). I have similar tone and output-formatting content in my system prompt.

kinlan 77 days ago

That's actually useful to know, and it aligns with what I see (I wrote the cost post).

weird-eye-issue 78 days ago

Yes, but with prompt caching decreasing the cost of the input by 90%, and with output tokens not being cached and costing more, what do you think that results in?

wongarsu 78 days ago

However, output tokens are 5-10 times more expensive, so it ends up a lot more even on price.

weird-eye-issue 78 days ago

Even more than that in practice, once you factor in prompt caching.
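The arithmetic behind this exchange can be sketched roughly as follows. The token shares are the ones quoted from the post, and the 5-10x price ratio and 90% cache discount are the figures mentioned in this thread. Everything else is an assumption made for illustration: that reasoning tokens are billed at the output rate, that the cache hit rate is a free parameter, and the hypothetical function name `output_cost_share`. It is not any provider's real pricing.

```python
# Token shares quoted in the post (percent of all tokens).
INPUT_SHARE = 93.4
REASONING_SHARE = 2.5
OUTPUT_SHARE = 4.0


def output_cost_share(price_ratio, cache_hit_rate=0.0, cache_discount=0.90):
    """Fraction of total spend that goes to generated (reasoning + output) tokens.

    price_ratio:    how many times more an output token costs than an input token
                    (5-10 per the thread).
    cache_hit_rate: assumed fraction of input tokens served from the prompt cache.
    cache_discount: price reduction on cached input tokens (90% per the thread).
    """
    # Input cost in "input-token price units", after the cache discount.
    effective_input = INPUT_SHARE * (1.0 - cache_discount * cache_hit_rate)
    # Assumption: reasoning tokens are billed like output tokens.
    generated = (REASONING_SHARE + OUTPUT_SHARE) * price_ratio
    return generated / (effective_input + generated)


for ratio in (5, 10):
    for hit in (0.0, 0.5, 1.0):
        share = output_cost_share(ratio, cache_hit_rate=hit)
        print(f"price ratio {ratio:>2}x, cache hit rate {hit:.0%}: "
              f"generated tokens are {share:.1%} of cost")
```

Under these assumptions, generated tokens go from about a quarter of the bill (5x, no caching) to well over three quarters once all input is cached, which is the shift both replies are pointing at.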

kinlan 77 days ago

I think we still skew back to an insanely high input token ratio when you consider agentic loops. For example, when I see the tools I use do a web fetch or a search or other tool use, it's an incredibly high number of new input tokens.
