| HN Mirror



	by hagen8 96 days ago
	Well, the question is what is contributing to the usage. As the context grows, the number of input tokens per call grows with it. With per-token pricing, a model call with 800K input tokens costs 8 times as much as a call with 100K input tokens. It gets especially expensive at API pricing if we resume a conversation and the cache does not hit, because the whole context is then billed at the full input rate again.
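
	To make the arithmetic concrete, here is a rough sketch. The price and the cache discount are made-up placeholders, not any provider's real rates, and `input_cost` is a hypothetical helper; it only assumes that input is billed linearly per token and that cached tokens are billed at some fraction of the normal rate.

```python
def input_cost(tokens: int, price_per_mtok: float,
               cached_fraction: float = 0.0,
               cache_discount: float = 0.1) -> float:
    """Input-side cost of one model call, in dollars.

    cached_fraction: share of the input served from the prompt cache (0.0 to 1.0).
    cache_discount:  assumed price multiplier for cached tokens (hypothetical).
    """
    cached = tokens * cached_fraction
    uncached = tokens - cached
    return (uncached + cached * cache_discount) * price_per_mtok / 1_000_000


PRICE = 3.0  # hypothetical USD per million input tokens

small = input_cost(100_000, PRICE)
large = input_cost(800_000, PRICE)
ratio = large / small  # 8x, because cost is linear in input tokens

# Resuming an 800K-token conversation: full cache hit vs. full cache miss.
resume_hit = input_cost(800_000, PRICE, cached_fraction=1.0)
resume_miss = input_cost(800_000, PRICE, cached_fraction=0.0)

print(f"100K call: ${small:.2f}, 800K call: ${large:.2f}, ratio: {ratio:.1f}x")
print(f"resume with cache hit: ${resume_hit:.2f}, with cache miss: ${resume_miss:.2f}")
```

	Under these assumptions every turn of a long conversation pays for the whole context again, so the cache hit rate matters more than the size of any single message.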