Hacker News
by mvkel 763 days ago
It seems like few-shot prompting, i.e. providing some examples to LLMs with large context windows, vastly outperforms any amount of RAG or fine-tuning.

Aren't RAG and fine-tuning fundamentally flawed, because they only play at the surface of the model? It's like putting sprinkles on top of a cake and expecting them to completely change the flavor. I know LoRA is supposed to appropriately weight the data, but the results say that's not the solution.

Also anecdotal, but way less work!

4 comments

Models get confused with long context windows, so shorter is better, and in general you cannot fit everything into them anyway. I'm not sure where you are seeing results that say otherwise.

RAG is effectively prompt context optimization, so categorically rejecting doing that doesn't make sense to me. Maybe if models internalized that or scaled... But they don't.

Totally agree. Every decision on what context to put in a context window is “RAG”. Somehow the term was co-opted to refer to “context selected by vector similarity”, so presumably when people say “is RAG hanging around”, what they mean is “are vectors a complete solution”, to which the answer is obviously “no”. But you still need some sort of _relevance function_ to pick your context - even if it’s pin-the-tail-on-the-donkey. That’s “RAG”.

Doesn’t make sense to ask “will we still have to curate our context?” The answer is of course you will.
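
To make the "relevance function" point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: `keyword_overlap` is a toy scorer standing in for whatever you actually use (vector similarity, BM25, recency, a hand-written rule), and `select_context` with its character budget is just one simple way to fill a window.

```python
from typing import Callable, List


def keyword_overlap(query: str, doc: str) -> float:
    """Toy relevance function: fraction of query words that appear in doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def select_context(
    query: str,
    docs: List[str],
    relevance: Callable[[str, str], float],
    budget_chars: int,
) -> List[str]:
    """Greedily pick the highest-scoring docs that fit in the budget.

    The budget is counted in characters as a stand-in for tokens
    (an assumption made to keep the sketch dependency-free).
    """
    ranked = sorted(docs, key=lambda d: relevance(query, d), reverse=True)
    chosen: List[str] = []
    used = 0
    for doc in ranked:
        if used + len(doc) > budget_chars:
            continue  # skip docs that would overflow the window
        chosen.append(doc)
        used += len(doc)
    return chosen
```

Swapping `keyword_overlap` for an embedding-based scorer changes the ranking but not the shape of the problem: something still has to decide what goes in the window.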

RAG and fine-tuning are very different. Few-shot prompting and RAG are both variants of in-context learning.

That's definitely my experience as well: a sufficiently large context window with a capable enough general-purpose LLM solves many, if not all, of the problems RAG and fine-tuning claim to solve.

I've also had significant (anecdotal) success just throwing in the available context before prompting. I've written multiple automations this way as well.
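
For anyone wondering what "throwing in available context before prompting" might look like, here is a rough sketch. The `build_prompt` helper and its `Q:`/`A:` layout are made up for illustration, not any particular library's format. It also shows why few-shot examples and retrieved context are both in-context learning: they end up as text in the same prompt.

```python
from typing import List, Tuple


def build_prompt(
    task: str,
    examples: List[Tuple[str, str]],
    context: List[str],
    question: str,
) -> str:
    """Assemble one prompt from instructions, context, and few-shot examples."""
    parts = [task]
    if context:
        # Retrieved or hand-picked context goes in as plain text.
        parts.append("Context:\n" + "\n".join(f"- {c}" for c in context))
    for q, a in examples:
        # Few-shot examples: worked question/answer pairs.
        parts.append(f"Q: {q}\nA: {a}")
    # The real question comes last, with the answer left open for the model.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


prompt = build_prompt(
    task="Answer briefly.",
    examples=[("2+2?", "4")],
    context=["Paris is the capital of France."],
    question="Capital of France?",
)
print(prompt)
```

The resulting string can be sent to whichever model you use; nothing here depends on a specific API.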