by trashtester 1149 days ago

That's probably true for most kinds of NN architectures, including convolutional layers and older recurrent architectures (LSTM, etc). Fully connected networks do not seem to be a necessary, and certainly not an efficient, way to represent the mechanisms that operate in the "real world", so clever ways to make the networks sparse are an important key.
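To put rough numbers on the sparsity point, here is a back-of-the-envelope sketch comparing a fully connected layer with a 3x3 convolution over the same input and output shapes. The 32x32 image size and the channel counts are made up for illustration, and the helper names are hypothetical.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases for a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(c_in: int, c_out: int, k: int) -> int:
    """Weights plus biases for a k x k convolution (weights shared across positions)."""
    return k * k * c_in * c_out + c_out


# Assumed shapes, chosen only for illustration: a 32x32 RGB input mapped to 16 feature maps.
h, w, c_in, c_out, k = 32, 32, 3, 16, 3

dense = dense_params(h * w * c_in, h * w * c_out)
conv = conv2d_params(c_in, c_out, k)

print(f"fully connected: {dense:,} parameters")
print(f"3x3 convolution: {conv:,} parameters")
print(f"ratio: about {dense // conv:,}x")
```

The convolution is effectively the same dense layer with almost all weights forced to zero and the remaining ones shared across positions, which is the kind of structured sparsity I mean.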

But it's equally important to create architectures that allow efficient backpropagation of errors.

It does seem like transformers are pretty good at both, already.

I kind of hope we're not getting something radically better anytime soon, because it seems like AGI is already approaching faster than we can prepare for.

Then again, I would expect that someone somewhere is already using transformer-based networks to develop some brand new architecture that does in fact provide such a leap.