| HN Mirror



	by energy123 321 days ago
	No, you're talking about costs to the user, which are an oversimplification of the costs that providers bear. One output token with a million input tokens is incredibly cheap for providers.


danenania 321 days ago

> One output token with a million input tokens is incredibly cheap for providers

Source? Afaik this is incorrect.


danielbln 321 days ago

Check out any LLM API provider's pricing. Output tokens are always significantly more expensive than input (which can also be cached).


danenania 321 days ago

Input tokens usually dominate output tokens by a lot more than 2x though. It’s often 10x or more input. It can even easily be 100x or more. Again in realistic workflows.

Caching does help the situation, but you always at least pay the initial cache write. And prompts need to be structured carefully to be cacheable. It’s not a free lunch.
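
To make the arithmetic concrete, here's a rough sketch. The prices ($1 per million input tokens, $4 per million output tokens) and the 0.1x cache-read multiplier are made-up placeholders for illustration, not any provider's actual rates, and `cost_breakdown` is a hypothetical helper. It also ignores the initial cache write.

```python
def cost_breakdown(input_tokens, output_tokens,
                   input_price_per_m, output_price_per_m,
                   cached_tokens=0, cache_read_multiplier=1.0):
    """Split one request's user-facing cost into input and output parts.

    Prices are in dollars per million tokens. cached_tokens is the part of
    the input billed at cache_read_multiplier times the normal input price.
    All values passed in below are illustrative assumptions.
    """
    uncached_tokens = input_tokens - cached_tokens
    billable_input = uncached_tokens + cached_tokens * cache_read_multiplier
    input_cost = billable_input * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    total = input_cost + output_cost
    input_share = input_cost / total if total > 0 else 0.0
    return {"input_cost": input_cost,
            "output_cost": output_cost,
            "input_share": input_share}


# 100x more input than output, output priced 4x higher, no caching.
no_cache = cost_breakdown(100_000, 1_000, 1.0, 4.0)

# Same request, with 90% of the input read from cache at a 0.1x rate.
with_cache = cost_breakdown(100_000, 1_000, 1.0, 4.0,
                            cached_tokens=90_000, cache_read_multiplier=0.1)

print(f"input share without cache: {no_cache['input_share']:.0%}")
print(f"input share with cache:    {with_cache['input_share']:.0%}")
```

Under these assumed numbers, input is still most of the bill even though each output token costs 4x more, and it stays the larger part after a 90% cache hit.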
