Ask HN: Do modern AI engines still need to do full re-trainings?
11 points
by zepearl
695 days ago
I learned about ~AI algorithms in the 90s: backpropagation & clustering networks, and a little bit of genetic algos. I then focused on, programmed, and played for a while with the "backpropagation" network model, until the early 2000s. It was fun, but not usable in my context, so I stopped fiddling with it and became inactive in this area.

An important property of a backpropagation network was (as far as I know) that it had to be fully re-trained whenever the inputs changed (values of existing ones changed, or inputs/outputs were removed/added).

Question: is it still like that for the currently fancy algos (the ones developed by Google/Facebook/OpenAI/Xsomething/...), or are they better now, so that they can adapt without having to be fully retrained on the full set of (new/up-to-date) training data?

Asking because I lost track of the progress in this area during the last 20 years, and especially recently I understand nothing involving all the new names (e.g. "llama", etc.). Thanks :)
It's widely used, you can look it up.
A more challenging idea is whether it is possible to reuse the pretrained weights when training a network with a different architecture (maybe a bigger transformer with more heads, or something).
AFAIK this is not common practice: if you change the architecture, you have to retrain from scratch. But given the cost of these training runs, I wouldn't be surprised if OpenAI & co. had developed some technique to do this, e.g. across GPT versions.
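For what it's worth, one published idea in this direction is function-preserving widening, as in the Net2Net paper (Chen, Goodfellow, Shlens). Below is a minimal pure-Python sketch of that idea for a toy two-layer network, not a claim about what any lab actually does: the names `forward` and `widen_hidden_layer`, the layer sizes, and the random weights are all made up for illustration. Each extra hidden unit is a copy of an existing one, and the outgoing weights are split across the copies so the bigger network starts out computing exactly the same function.

```python
import math
import random


def forward(x, w1, b1, w2, b2):
    """Toy two-layer MLP: ReLU hidden layer, linear output (nested lists)."""
    hidden = [
        max(0.0, sum(x[i] * w1[i][j] for i in range(len(x))) + b1[j])
        for j in range(len(b1))
    ]
    return [
        sum(hidden[j] * w2[j][k] for j in range(len(hidden))) + b2[k]
        for k in range(len(b2))
    ]


def widen_hidden_layer(w1, b1, w2, new_width, rng):
    """Grow the hidden layer to new_width units without changing the output.

    Hypothetical helper: the output bias is untouched, so it is not passed in.
    """
    old_width = len(b1)
    if new_width < old_width:
        raise ValueError("new_width must be >= the current hidden width")
    # Keep every original unit, then fill the extra slots with random copies.
    extra = [rng.randrange(old_width) for _ in range(new_width - old_width)]
    mapping = list(range(old_width)) + extra
    # Number of copies that now exist for each original unit.
    copies = [mapping.count(j) for j in range(old_width)]
    # Copied units get the same incoming weights and bias as their source.
    new_w1 = [[row[src] for src in mapping] for row in w1]
    new_b1 = [b1[src] for src in mapping]
    # Outgoing weights are divided by the copy count so the sums match.
    new_w2 = [[w / copies[src] for w in w2[src]] for src in mapping]
    return new_w1, new_b1, new_w2


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Stand-in for "pretrained" weights: 4 inputs, 3 hidden units, 2 outputs.
w1, b1 = rand_matrix(4, 3), [rng.gauss(0.0, 1.0) for _ in range(3)]
w2, b2 = rand_matrix(3, 2), [rng.gauss(0.0, 1.0) for _ in range(2)]
x = [rng.gauss(0.0, 1.0) for _ in range(4)]

wide_w1, wide_b1, wide_w2 = widen_hidden_layer(w1, b1, w2, new_width=7, rng=rng)
small_out = forward(x, w1, b1, w2, b2)
wide_out = forward(x, wide_w1, wide_b1, wide_w2, b2)
```

Here `small_out` and `wide_out` agree up to floating-point rounding, so training can continue from the wider network instead of from random weights. The copied units start out identical, so in practice a little noise is usually added to let them diverge during further training.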