by firejake308 763 days ago
True, this is how RAG works, but this is why I prefer to use open-source LLMs for RAG: the token costs are less opaque, and I can control how many chunks I pull from the database to manage my costs.
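
To make the "control how many chunks" point concrete, here is a rough sketch of capping retrieval by both chunk count and a token budget. Everything in it is an assumption for illustration: the function names are hypothetical, the 4-characters-per-token estimate is a crude heuristic (a real tokenizer would be more accurate), and the price passed in is a made-up number, not any provider's actual rate.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Swap in a real tokenizer
    # for your model if you need accurate counts.
    return max(1, len(text) // 4)


def select_chunks(ranked_chunks, max_chunks, token_budget):
    """Take chunks in rank order until the chunk cap or token budget is hit.

    ranked_chunks: list of strings, best match first (e.g. from a vector DB).
    Returns (selected_chunks, tokens_used).
    """
    selected = []
    used = 0
    for chunk in ranked_chunks[:max_chunks]:
        cost = estimate_tokens(chunk)
        if used + cost > token_budget:
            break  # adding this chunk would exceed the budget
        selected.append(chunk)
        used += cost
    return selected, used


def estimate_cost(tokens: int, price_per_million_tokens: float) -> float:
    # The price is whatever your provider (or your own hardware math) says it is.
    return tokens * price_per_million_tokens / 1_000_000


if __name__ == "__main__":
    # Four fake chunks of 400 characters each (about 100 tokens apiece).
    chunks = ["a" * 400, "b" * 400, "c" * 400, "d" * 400]
    selected, used = select_chunks(chunks, max_chunks=3, token_budget=250)
    print(len(selected), used, estimate_cost(used, price_per_million_tokens=0.5))
```

The two knobs (`max_chunks` and `token_budget`) are the whole point: you decide up front how much context goes into the prompt, so the input-token bill is bounded before the model is ever called.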

I believe it will get better and more efficient as we go. On a side note, OpenAI seems to release products before they are ready and they evolve as they go.
> I believe it will get better and more efficient as we go.

Yes, of course. The point remains: the LLM has to process the data somehow.

If you are concerned about costs and token usage, then switch to a provider that works for your problem (Gemini Flash looks very interesting...)