| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by tempusalaria 253 days ago
	Lots of situations, here are 2 I’ve faced recently (cannot give too much detail for privacy reasons, but should be clear enough) 1) low latency desired, long user prompt 2) function runs many parallel requests, but is not fired with common prefix very often. OpenAI was very inconsistent about properly caching the prefix for use across all requests, but with Anthropic it’s very easy to pre-fire