by vlovich123
35 days ago
Even accepting the premise, it should be obvious that 10 dumber models running 10x as fast != 1 smarter model. Otherwise engineering would just be a matter of throwing people at a problem, when it’s very clear that 1 talented engineer can outperform a team of engineers or accomplish things the team would never have been able to. There’s also the assumption you’re making that a 10x smaller model is 10x dumber when it’s not: it’s a curve, and some people seem to struggle with nonlinear effects.
If a smaller model tries ten things and reaches the same conclusion that the big model gets on the first try, then yeah, 10x small = 1x big. Is that where we are now? Idk, probably not, but it’s not hard to imagine something like that emerging soon. There is already evidence that smaller models get some things _better_ than bigger models (e.g. https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag... )
> There’s also the assumption you’re making that a 10x smaller model is 10x dumber when it’s not
That is not an assumption I am making. I said “a smaller model”, not “a 10x smaller model”. Model speed and model “intelligence” are both nonlinear.