by patall
91 days ago
Maybe a naive question: given that they see better performance with more passes, but the effect plateaus after a few passes, would performance increase if they used a different model per pass, e.g. leanstral, kimi, qwen and then leanstral again instead of 4x leanstral?
|
It does actually significantly boost performance. There was an article on here about it recently, I'll see if I can find it.
Edit: https://news.ycombinator.com/item?id=44630724
They found the more different the models were (the less overlap in correctly solved problems), the more it boosted the score.
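
To make the overlap point concrete, here is a toy sketch (not from the linked article). The problem IDs and solved sets are made up, and it assumes any correct attempt can be recognized, so a problem counts as solved if at least one pass solves it. The helper names union_coverage and jaccard_overlap are just for illustration.

```python
def union_coverage(solved_sets, total):
    """Fraction of problems solved by at least one pass."""
    solved_by_any = set().union(*solved_sets)
    return len(solved_by_any) / total


def jaccard_overlap(a, b):
    """Overlap between two solved sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


TOTAL = 10  # hypothetical benchmark size

# Made-up data: each pass solves 6 of 10 problems.
model_a = {0, 1, 2, 3, 4, 5}
model_a_again = {0, 1, 2, 3, 4, 6}  # same model, second pass: mostly the same problems
model_b = {0, 1, 2, 7, 8, 9}        # different model: fewer problems in common

same_model = union_coverage([model_a, model_a_again], TOTAL)
mixed_models = union_coverage([model_a, model_b], TOTAL)

print(f"overlap A/A again: {jaccard_overlap(model_a, model_a_again):.2f}, coverage: {same_model:.1f}")
print(f"overlap A/B:       {jaccard_overlap(model_a, model_b):.2f}, coverage: {mixed_models:.1f}")
```

With these invented numbers every pass has the same individual score, yet the pair with less overlap covers more problems together. Real results depend on how the final answer is selected from the passes.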