by ofermend 1032 days ago
It's not impossible that fine-tuning would also help RAG, but it's certainly not guaranteed and it's hard to control. Fine-tuning essentially changes the weights of the model, and might result in other, potentially negative outcomes, like loss of other knowledge or capabilities in the resulting fine-tuned LLM.

Other considerations: (A) How often would you fine-tune? Daily? Weekly? Whenever the data changes? (B) Cost and availability of GPUs (there's currently a shortage).

My experience is that RAG is the way to go, at least right now.

But you have to make sure your retrieval engine works optimally, i.e. that it gets the most relevant pieces of text from your data: (1) using a good chunking strategy that's better than arbitrary 1K or 2K chars, (2) using a good embedding model, (3) using hybrid search, and a few other things like that.
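
To make points (1) and (3) a bit more concrete, here is a rough sketch, not a description of any particular product's pipeline. It assumes that "better than arbitrary 1K or 2K chars" can be approximated by packing whole sentences into chunks, and that hybrid search means merging a keyword ranking with a vector ranking. For the merge it uses reciprocal rank fusion, which is just one common option. The function names, the regex-based sentence splitter, and k=60 are all illustrative assumptions.

```python
import re


def chunk_by_sentence(text, max_chars=1000):
    """Pack whole sentences into chunks of at most max_chars characters."""
    # Naive sentence split (assumption): break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            # A single sentence longer than max_chars is kept whole, not cut.
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every doc it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    print(chunk_by_sentence("One two. Three four. Five six.", max_chars=20))
    keyword_hits = ["a", "b", "c"]  # e.g. from a BM25 index (hypothetical)
    vector_hits = ["b", "c", "d"]   # e.g. from an embedding index (hypothetical)
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that show up in both rankings float to the top, which is the main point of combining keyword and vector retrieval. A real setup would likely use a proper sentence or layout-aware splitter instead of the regex.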

The availability of models with longer context windows is certainly a big help too.

Sharing this relevant discussion from LinkedIn: https://www.linkedin.com/feed/update/urn:li:activity:7101638...