by 2099miles 789 days ago
Yeah, I'd say cost is the biggest thing. Why doesn't everyone just use GPT-4 for everything, or Gemini Ultra + RAG with every document in the RAG system and the best embedding model?

Among other things, because it's way too expensive. Narrowing your scope cuts costs hugely and isn't hard to do at a high level.
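
To put rough numbers on that, here is a back-of-the-envelope sketch. The price and token counts are made up for illustration (they are not real pricing for any model), and `prompt_cost` is just a hypothetical helper:

```python
def prompt_cost(num_tokens: int, price_per_1k_tokens: float) -> float:
    """Cost of sending num_tokens of input at a given per-1k-token price."""
    return num_tokens / 1000 * price_per_1k_tokens


# Hypothetical numbers, for illustration only (not real pricing).
PRICE_PER_1K = 0.01            # assumed input price in dollars per 1k tokens
FULL_CORPUS_TOKENS = 100_000   # stuffing everything into the prompt
SCOPED_TOKENS = 4_000          # only the few chunks relevant to the query

full = prompt_cost(FULL_CORPUS_TOKENS, PRICE_PER_1K)
scoped = prompt_cost(SCOPED_TOKENS, PRICE_PER_1K)

# Input cost scales linearly with tokens, so the ratio is just the token ratio.
savings_ratio = full / scoped
print(f"full: ${full:.2f}  scoped: ${scoped:.2f}  ratio: {savings_ratio:.0f}x")
```

Under these assumed numbers the scoped prompt is about 25x cheaper per request, and that gap is paid on every single call.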

1 comments

There is also the problem that most of today's LLMs will somehow lose (or ignore) the middle of the context and prefer the beginning and/or end.
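
One workaround people try is to reorder retrieved chunks so the most relevant ones sit at the edges of the prompt and the weakest land in the middle. A minimal sketch, assuming the chunks already arrive sorted from most to least relevant (`reorder_for_edges` is a hypothetical name, and whether this helps will depend on the model):

```python
def reorder_for_edges(chunks_by_relevance):
    """Put the most relevant chunks at the start and end, least relevant in the middle.

    chunks_by_relevance: list ordered from most to least relevant.
    """
    front, back = [], []
    for i, chunk in enumerate(chunks_by_relevance):
        # Alternate: even ranks fill the front, odd ranks fill the back.
        if i % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-best chunk ends up last.
    return front + back[::-1]


ranked = ["a", "b", "c", "d", "e"]  # "a" is the most relevant
print(reorder_for_edges(ranked))
```

With five chunks ranked a through e, the best one goes first, the second best goes last, and the least relevant one ends up in the middle.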