Hacker News
by xrd 736 days ago
It feels like the two dumb ways to customize an open LLM are fine-tuning and RAG. The former is expensive and complicated; the latter adds complexity to your queries but doesn't require upfront compute for retraining.

I couldn't tell how expensive this is up front, or what complexity it adds to the setup. Anyone know?

It's definitely an interesting idea but if you have to pay $100k for all that LoRA, what margins are left over?

1 comment

What do you think is complicated about RAG? I'm not arguing that it's effortless, but is it really that complicated? Genuinely interested to hear other people's pain points.
Well, you usually need to generate embeddings, then query and filter them, but it certainly isn't that complicated.
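
For anyone who wants to see those steps spelled out, here is a rough sketch of embed, filter, query, then build the prompt. Everything in it is an assumption for illustration: `embed` is a toy word-count stand-in for a real embedding model, the `source` field is a made-up metadata key, and `DOCS` is invented sample data. A real setup would likely swap in a learned embedding model and a vector store, but the shape of the pipeline would be roughly the same.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs):
    """Step 1: embed every document once, ahead of query time."""
    return [(embed(d["text"]), d) for d in docs]


def retrieve(index, question, source=None, k=2):
    """Steps 2 and 3: filter on metadata, then rank by similarity to the question."""
    q = embed(question)
    scored = [
        (cosine(q, vec), doc)
        for vec, doc in index
        if source is None or doc["source"] == source  # hypothetical metadata filter
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share nothing with the question.
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(question, hits):
    """Step 4: paste the retrieved text into the prompt sent to the LLM."""
    context = "\n".join("- " + d["text"] for d in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Invented sample documents, purely for illustration.
DOCS = [
    {"text": "LoRA trains small low-rank adapter matrices on a frozen model.", "source": "notes"},
    {"text": "RAG retrieves relevant documents and adds them to the prompt.", "source": "notes"},
    {"text": "The cafeteria serves lunch from noon until two.", "source": "wiki"},
]

if __name__ == "__main__":
    index = build_index(DOCS)
    question = "How does RAG use documents?"
    hits = retrieve(index, question, source="notes", k=1)
    print(build_prompt(question, hits))
```

The only work done ahead of time is `build_index`; the extra complexity lands on each query, in `retrieve` and `build_prompt`, and no retraining is involved.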