by ramoz
811 days ago
Depending on the application, you would do continued pretraining over the new tokens to gain new knowledge; continued pretraining is applicable at 100M tokens. You would certainly fine-tune for domain-specific tasks, and for that you would curate a subset of the 100M tokens. The total token count in the alignment study referenced is 1,000,000. RAG is a hacky way to interpolate new knowledge with a base model, and it is not always reliable or easy to integrate into task-specific workflows.
We’re building some “smart search” functionality for a few teams, and I’m starting to wonder whether a traditional search results list showing the document chunks (i.e. without the LLM, or with the LLM used only to rewrite the user query) wouldn’t be better than blindly taking the top N chunks and feeding them to the LLM to produce a response.
E.g. we have some docs about specific supermarket chains in which the word “supermarket” might not appear at all, while the user query might be “show me what we have about supermarkets”. The embeddings will hopefully place “supermarket” close to, say, “Costco”, but they might place it even closer to “shopping center”, and we might have docs about shopping centers that rank higher. So we might take the top 5 docs and send them to the LLM while the docs the user was after sat in 7th and 9th position, never seen by the LLM or the user.
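
To make the cutoff problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the 2-D vectors are hand-made stand-ins for real embeddings, the doc names are invented, and the cutoff is 2 rather than 5 to keep the example short. It only illustrates how a fixed top-N drops lower-ranked docs that a plain results list would still show.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return doc ids sorted by descending similarity to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored]

# Hypothetical toy "embeddings": the shopping-center docs happen to sit
# closer to the query than the supermarket-chain docs do.
query = [1.0, 0.0]  # stand-in for "show me what we have about supermarkets"
docs = {
    "shopping_center_a": [0.95, 0.10],
    "shopping_center_b": [0.90, 0.20],
    "costco_report":     [0.80, 0.40],
    "aldi_report":       [0.70, 0.50],
    "hr_policy":         [0.10, 0.90],
}

TOP_N = 2  # assumed cutoff for what gets fed to the LLM

ranked = rank(query, docs)
sent_to_llm = ranked[:TOP_N]   # all the LLM ever sees
never_seen = ranked[TOP_N:]    # a plain results list would still show these

print("sent to LLM:", sent_to_llm)
print("below the cutoff:", never_seen)
```

With a traditional results list the user could scroll past the shopping-center docs and find the supermarket ones; with the top-N handoff they are gone before the LLM writes anything.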