by irthomasthomas
15 days ago
I have had a better experience in my own use. I use it every day and it rarely fails to improve the result. Perhaps the prompts and rubrics make a difference. Finding bugs is one of the better use cases because it is essentially a search problem. As long as models are non-deterministic and there is some diversity in their training data, an ensemble that iterates on the problem is more likely to cover the ground needed to find a solution. Some tasks benefit from this approach more than others. There was a paper from Google on a very similar system that achieved SOTA at the time on planning and pathfinding benchmarks. edit: Mind Evolution paper
https://deepmind.google/research/publications/122391/ (That was a month after I published llm-consortium :)
https://xcancel.com/karpathy/status/1870692546969735361
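
To put a rough number on the "cover the ground" point: a minimal sketch, assuming each attempt independently finds the bug with the same probability. Both `p` and `k` are hypothetical values for illustration, not measurements from llm-consortium or from the paper, and real models that share training data are not fully independent, so treat this as an upper bound on the benefit.

```python
def coverage(p: float, k: int) -> float:
    """Chance that at least one of k independent attempts succeeds.

    p: assumed per-attempt success probability (hypothetical).
    k: number of attempts across the ensemble and its iterations.
    """
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Show how coverage grows as more attempts are added.
    for k in (1, 3, 5, 10):
        print(k, round(coverage(0.2, k), 3))
```

The gain from each extra attempt shrinks as k grows, which may be part of why some tasks benefit more than others.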