| HN Mirror



	by Traubenfuchs 374 days ago
	I bet there is a set of repetitive one- or two-question user requests that makes up a sizeable share of all requests. The models are so expensive to run that serving even 1% of requests from a cache would be worth it, probably much less than 1%. To make it less obvious, they probably keep a big set of response variants for each cached request. I don't see how they would not do this. They probably also have cheap code or cheap models that normalize requests to increase the cache hit rate.
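
	A rough sketch of what that could look like, purely as a guess and not based on any provider's actual setup. The names `normalize`, `VariantCache`, `answer` and the `expensive_model` callable are all made up for illustration, and the normalization here is cheap string cleanup rather than a small model.

```python
import random
import re


def normalize(prompt: str) -> str:
    """Cheap normalization (an assumption): lowercase, drop common
    sentence punctuation, collapse whitespace."""
    text = prompt.lower()
    text = re.sub(r"[.,!?;:]+", "", text)
    return re.sub(r"\s+", " ", text).strip()


class VariantCache:
    """Maps a normalized prompt to a list of stored response variants."""

    def __init__(self, rng=None):
        self._variants = {}
        self._rng = rng or random.Random()
        self.hits = 0
        self.misses = 0

    def add(self, prompt: str, response: str) -> None:
        # Several variants may be stored under the same normalized key.
        self._variants.setdefault(normalize(prompt), []).append(response)

    def get(self, prompt: str):
        variants = self._variants.get(normalize(prompt))
        if not variants:
            self.misses += 1
            return None
        self.hits += 1
        # Pick a random variant so repeated requests do not look identical.
        return self._rng.choice(variants)


def answer(prompt: str, cache: VariantCache, expensive_model) -> str:
    """Serve from the cache when possible, else call the (hypothetical) model."""
    cached = cache.get(prompt)
    if cached is not None:
        return cached
    response = expensive_model(prompt)
    cache.add(prompt, response)
    return response
```

	With this, "What is the capital of France?" and "  what is the capital of   france " land on the same key, so only the first one reaches the model. A real system would also need eviction and a rule for which prompts are safe to cache at all.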