by visarga
1641 days ago
It's language modelling with a search engine in the loop. Instead of training GPT-3 with 175B weights, you train a 25x smaller model and let it retrieve useful snippets from a large text index as additional information. This addresses two problems at once: the cost of very large models, and the difficulty of updating an already trained model, since you can swap the text corpus for a newer one. The model learns mostly syntax and burns less trivia into its weights than a regular LM, because it can simply copy the relevant information from the index.

This development was bound to happen: large LMs are expensive to use, and it was an obvious idea. We've had these semantic search text indices for a few years already [1]; they just weren't combined with text generation.

[1] https://github.com/spotify/annoy
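To make the "swap the corpus" point concrete, here is a rough toy sketch of the retrieval half only. Everything in it is an assumption for illustration: `embed` is a bag-of-words stand-in for a learned encoder, `SnippetIndex` is a brute-force stand-in for an approximate nearest-neighbour index such as Annoy, and `build_prompt` is a hypothetical helper that just prepends the retrieved snippets to the prompt. It is not how any particular model wires retrieval into its layers.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a learned encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SnippetIndex:
    """Brute-force text index; a real system would use an ANN index instead."""

    def __init__(self, snippets):
        self.snippets = list(snippets)
        self.vectors = [embed(s) for s in self.snippets]

    def search(self, query, k=2):
        """Return the k snippets most similar to the query."""
        q = embed(query)
        ranked = sorted(
            range(len(self.snippets)),
            key=lambda i: cosine(q, self.vectors[i]),
            reverse=True,
        )
        return [self.snippets[i] for i in ranked[:k]]


def build_prompt(index, query, k=2):
    """Prepend retrieved snippets so a small LM can copy facts from them."""
    context = "\n".join(f"- {s}" for s in index.search(query, k))
    return f"Retrieved snippets:\n{context}\n\nPrompt: {query}"
```

Updating the "knowledge" is then just building a new `SnippetIndex` over a newer corpus and passing it to `build_prompt`; the language model itself is untouched.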