Hacker News
by danielfalbo 166 days ago
How do we measure whether this is any better than just using one good model?
2 comments

One day someone will actually build something with an LLM and do a write-up of it, but until then we'll just keep reading about tooling.
Anecdotal experience, but when bugfixing I find that if a model introduces a bug, it has a hard time spotting and fixing it. Give the same code to another model, though, and it can spot it instantly (even if it's a weaker model overall).

So I can well imagine that this sort of approach could work very well, although I agree with your sentiment that measurement would be good.
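
For what it's worth, measuring this doesn't need much machinery. Below is a rough sketch, not anything from the linked project: take a set of code samples with known bugs, run each through a "self" reviewer (the model that wrote the code) and a "cross" reviewer (a different model), and compare how often each flags the bug. The `Reviewer` type, `detection_rate`, and `compare` are hypothetical names, and the reviewers are plain callables standing in for real model calls.

```python
from typing import Callable, Dict, Sequence

# Hypothetical interface: a reviewer takes source code and returns True if it
# flags a bug. In practice this would wrap an LLM call; here it is a stand-in.
Reviewer = Callable[[str], bool]


def detection_rate(reviewer: Reviewer, buggy_samples: Sequence[str]) -> float:
    """Fraction of known-buggy samples that the reviewer flags."""
    if not buggy_samples:
        raise ValueError("need at least one buggy sample")
    flagged = sum(1 for code in buggy_samples if reviewer(code))
    return flagged / len(buggy_samples)


def compare(
    self_review: Reviewer,
    cross_review: Reviewer,
    buggy_samples: Sequence[str],
) -> Dict[str, float]:
    """Detection rate of same-model review vs. review by a different model."""
    return {
        "self": detection_rate(self_review, buggy_samples),
        "cross": detection_rate(cross_review, buggy_samples),
    }


if __name__ == "__main__":
    # Toy stand-ins, NOT real models: the "self" reviewer only notices one
    # kind of bug, the "cross" reviewer flags everything it is shown.
    samples = ["a = b / 0", "x = [1, 2][5]"]
    result = compare(
        self_review=lambda code: "/ 0" in code,
        cross_review=lambda code: True,
        buggy_samples=samples,
    )
    print(result)
```

A reviewer that flags everything would score perfectly here, so a real comparison would also need bug-free samples to count false positives, plus a single strong model as the baseline.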