by popinman322
854 days ago
vs RAG: RAG is good for searching across billions of tokens or more and for providing up-to-date information to a static model. Even with huge context lengths, it's a good idea to submit high-quality inputs so the model doesn't go off on tangents, get stuck on contradictory information, etc.

vs fine-tuning: smaller, fine-tuned models can outperform huge models on a decent number of tasks. Not strictly fine-tuning, but for throughput-limited tasks it'll likely still be better to prune a 70B model down to 2B, keeping only the components you need for accurate inference.

I can see this model being good for taking huge inputs and compressing them down for smaller models to use.
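
To make the "submit high quality inputs" point concrete, here is a rough sketch of the retrieval half of RAG: score candidate chunks against the query and pass only the top few to the model, even if the context window could hold everything. The bag-of-words cosine score, the `select_context` helper, and the sample chunks are all made up for illustration; a real setup would more likely use learned embeddings and a vector index.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase, split on whitespace, and strip common punctuation.
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return [w for w in words if w]


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_context(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Hypothetical helper: keep only the k chunks most similar to the query.
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


# Made-up corpus: two relevant chunks, two distractors.
chunks = [
    "The billing API returns invoices in JSON format.",
    "Our office dog is named Biscuit.",
    "Invoices can be filtered by date through the billing API.",
    "The cafeteria menu changes every Monday.",
]
top = select_context("How do I filter invoices in the billing API?", chunks, k=2)
print(top)
```

The distractor chunks never reach the prompt, which is the whole point: fewer tangents and less contradictory material for the model to get stuck on.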