by HarHarVeryFunny
656 days ago
No amount of training would cause a fly brain to be able to do what an octopus or bird brain can, or to model their behavioral generating process. No amount of training will cause a transformer to magically sprout feedback paths or internal memory, or an ability to alter its own weights, etc. Architecture matters. The best you can hope for from an LLM is that training will converge on the best LLM generating process it can be, which can be great for in-distribution prediction, but lousy for novel reasoning tasks beyond the capability of the architecture.
|
Go back a few evolutionary steps and sure you can. Most ANN architectures have few to no inductive biases baked in, and the Transformer might be the most blank slate we've built yet.
>No amount of training will cause a transformer to magically sprout feedback paths or internal memory, or an ability to alter its own weights, etc.
A transformer can perform any computation it likes in a forward pass, and you can arbitrarily increase inference compute time with the token length. Feedback paths? Sure. Compute inefficient? Perhaps. Some extra programming around the model to facilitate this? Maybe, but the architecture certainly isn't stopping you.
Even if it couldn't, limited ≠ trivial. The human brain is not Turing complete.
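To make the "feedback via token length" point concrete, here is a minimal sketch, not anything resembling a real model. `generate` and `toy_step` are hypothetical names invented for illustration; `toy_step` stands in for a single forward pass and is a pure function with no hidden state. The only "memory" is the growing context that the loop feeds back in.

```python
from typing import Callable, List


def generate(step: Callable[[List[int]], int], prompt: List[int], n_new: int) -> List[int]:
    """Autoregressive loop: every output is appended to the context and fed back in."""
    context = list(prompt)
    for _ in range(n_new):
        # The "feedback path" lives out here, in the loop around the model.
        context.append(step(context))
    return context


def toy_step(context: List[int]) -> int:
    # Stand-in for one forward pass (an assumption for illustration only):
    # a stateless function of the visible context.
    return context[-1] + context[-2]


print(generate(toy_step, [0, 1], 8))
# -> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The step function never remembers anything between calls, yet the loop as a whole computes a recurrence. More generated tokens means more sequential steps of computation.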
Internal memory? Did you miss the memo? Recurrence is overrated. Attention is all you need.
That said, there are already state-keeping language model architectures around.
Altering weights? Can a transformer train continuously? Sure. It's not really compute-efficient, but the architecture certainly doesn't prohibit it.
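As a rough sketch of what "continuous training" means mechanically: the weights get a gradient update after every example that arrives, instead of being frozen after an offline training run. The example below uses a one-parameter-pair linear model as a stand-in (an assumption for brevity; it is not a transformer), and `online_sgd` is a hypothetical helper name. The same loop shape applies to any model trained by gradient descent, just at far higher cost.

```python
from itertools import cycle, islice
from typing import Iterable, Tuple


def online_sgd(stream: Iterable[Tuple[float, float]], lr: float = 0.1) -> Tuple[float, float]:
    """Update weights after every (x, y) pair as it arrives, using squared-error loss."""
    w, b = 0.0, 0.0
    for x, y in stream:
        err = (w * x + b) - y   # prediction error on the incoming example
        w -= lr * err * x       # gradient step for the weight
        b -= lr * err           # gradient step for the bias
    return w, b


# Toy "deployment" stream drawn from y = 2x + 1 (made-up data for illustration).
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
stream = ((x, 2.0 * x + 1.0) for x in islice(cycle(xs), 2000))
w, b = online_sgd(stream)
print(round(w, 3), round(b, 3))
```

Nothing about the model's structure has to change for this to work; what changes is that the optimizer keeps running at inference time, which is where the compute cost comes from.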
>Architecture matters
Compute efficiency? Sure. What it is capable of learning? Not so much.