by mikecaulley 799 days ago
This doesn’t consider compute cost; RAG is much more efficient than an infinite context length.

Agreed. I think RAG implemented via tool-calling, with multiple agents talking to each other, is a much more likely evolution than a single unified model.

I could very well be wrong! But we wouldn't want LLMs to perform lots of arithmetic by exploiting hidden parts of themselves that do linear regression or whatever; it's far better to just give them a calculator and get results faster and cheaper. Similarly, we can give them a search engine (RAG) and let them figure it out more efficiently.
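
To make the calculator/search idea a bit more concrete, here's a rough Python sketch of what the tool side could look like. Everything in it is made up for illustration: the `calculator`, `search`, and `dispatch` names, the `{"tool": ..., "input": ...}` call shape, and the three-line `DOCS` corpus are my assumptions, not any real framework's API. The "search engine" is just word overlap standing in for an actual retriever.

```python
import ast
import operator
import re

# Hypothetical calculator tool: evaluates plain arithmetic without using eval().
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def calculator(expression: str) -> float:
    def ev(node):
        # Numeric literals only (bool is excluded even though it subclasses int).
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

# Toy corpus standing in for a real document index (assumption for this sketch).
DOCS = [
    "RAG retrieves relevant passages and adds them to the prompt.",
    "Long context windows increase compute cost per request.",
    "Tool calling lets a model delegate arithmetic to a calculator.",
]

def _tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))

def search(query: str, k: int = 1) -> list:
    # Rank documents by how many query words they share; ties keep corpus order.
    terms = _tokens(query)
    ranked = sorted(DOCS, key=lambda d: len(terms & _tokens(d)), reverse=True)
    return ranked[:k]

# The model would emit a call like {"tool": "calculator", "input": "12 * (3 + 4)"};
# the host program runs the tool and feeds the result back into the conversation.
TOOLS = {"calculator": calculator, "search": search}

def dispatch(call: dict):
    return TOOLS[call["tool"]](call["input"])
```

The point is only that the model hands off the work it's bad at (exact arithmetic, looking things up) and gets a small result back, instead of burning context and compute doing it internally.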