	by lmeyerov 65 days ago
	We find it true in Louie.ai evals (AI for investigations): about a 10-20% lift, which is meaningful. It's measured here: botsbench.com. Unfortunately, it's undesirable in practice because people are already token-constrained even before adding this. One option is retrying only on failure, but even that is a bit tricky...
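
	For anyone wondering what "retrying only on failure" might look like, here is a rough sketch. It is not how Louie.ai does it: `run_once` and `is_failure` are hypothetical stand-ins for a model call and a failure check, and `max_attempts` is an arbitrary cap.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def retry_on_failure(
    run_once: Callable[[], T],
    is_failure: Callable[[T], bool],
    max_attempts: int = 3,
) -> Tuple[T, int]:
    """Call run_once, and call it again only while is_failure flags the result.

    run_once and is_failure are hypothetical placeholders supplied by the caller.
    Returns the last result and the number of attempts actually spent.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    result = run_once()
    attempts = 1
    # Extra calls (and extra tokens) are spent only when a failure is detected.
    while is_failure(result) and attempts < max_attempts:
        result = run_once()
        attempts += 1
    return result, attempts


if __name__ == "__main__":
    # Fake "model" that returns an empty answer twice, then a real one.
    outputs = iter(["", "", "final answer"])
    answer, used = retry_on_failure(lambda: next(outputs), lambda s: not s)
    print(answer, used)
```

	The tricky part is probably `is_failure`: the token savings only hold if failures can be detected cheaply and reliably, and a weak check either misses bad answers or triggers retries that were not needed.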