by sdenton4 1144 days ago
Yeah, there are really three options here: throw everything in context, fine-tune, or add external search à la RETRO.

The last one is definitely the cheapest option: updating what the model knows just means updating the search index, with no retraining.

1 comment

Yeah... we really need some kind of architecture that shuttles concept vectors to and from external storage and does similarity search over them, instead of forcing us to encode everything into giant tangles of coefficients.

GPT-4 seems to show that linear algebra can definitely do the job, but training is so expensive, and the model ends up so huge and inflexible.

It seems like having fixed-format vectors of knowledge that the model can use (denser and more precise than just incorporating tool results as tokens, like OpenAI's plugin approach) is a path forward towards extensibility and online learning.
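
For anyone who wants the "external storage plus similarity search" idea in concrete form, here is a rough sketch. It assumes fixed-size vectors held in a plain in-memory list and compared by brute-force cosine similarity. `VectorStore`, `add`, and `search` are made-up names for illustration, and this is not how RETRO or any plugin system actually does retrieval; it only shows that adding knowledge is an append rather than a training run.

```python
import math


class VectorStore:
    """Toy external memory (hypothetical): fixed-size vectors with payloads."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = []   # unit-normalized vectors
        self.payloads = []  # whatever each vector stands for

    @staticmethod
    def _normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError("cannot normalize a zero vector")
        return [x / norm for x in vec]

    def _check_dim(self, vec):
        if len(vec) != self.dim:
            raise ValueError(f"expected {self.dim} dims, got {len(vec)}")

    def add(self, vec, payload):
        # An "update" is just an append; no model weights are touched.
        self._check_dim(vec)
        self.vectors.append(self._normalize(vec))
        self.payloads.append(payload)

    def search(self, query, k=1):
        # Brute-force cosine similarity over every stored vector.
        # Assumption: a real system would use an approximate index instead.
        self._check_dim(query)
        q = self._normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), payload)
            for v, payload in zip(self.vectors, self.payloads)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = VectorStore(dim=3)
store.add([1.0, 0.0, 0.0], "fact A")
store.add([0.0, 1.0, 0.0], "fact B")
store.add([0.9, 0.1, 0.0], "fact C")  # added later, no retraining needed

# Returns the two closest entries as (similarity, payload) pairs.
top = store.search([1.0, 0.2, 0.0], k=2)
```

The hard part the comment is pointing at is not shown here: getting the model itself to read and write these vectors natively, rather than having the payloads pasted back in as tokens.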