|
|
|
|
|
by lmeyerov
65 days ago
|
|
We find it true in Louie.ai evals (ai for investigations), about a 10-20% lift which meaningful. It'd measured here: botsbench.com . Unfortunately, undesirable in practice due to people being token-constrained even before. One case is retrying only on failure, but even that is a bit tricky... |
|