Hacker News
by tc 3164 days ago
Things will start getting interesting when we figure out how to get move simulation and search into the network itself, rather than programming them on the outside. As far as I know, no-one has even the faintest idea of how to do that. We have an existence proof that this should be possible.

The networks are great at perception and snap-prediction. Anything a human can do in 200ms is fair game. And with clever engineering, we can make magic happen by iterating or integrating those things.

But it's after that first 200ms that humans get really intelligent. When we can come up with an architecture that lets the networks themselves start simulating possibilities, backtracking, deciding when to answer now or to think more -- when the network owns the loop -- then it will get interesting.

3 comments

> We have an existence proof that this should be possible.

Not guaranteed. The human brain has diffusion signalling (i.e. neurotransmitters passing out of the synaptic cleft, into a neighbouring one, and activating a receptor on some other spatially local axon as a result). And one of those signalling molecules is thought to represent, in its intensity, a confidence-interval bias adjustment (i.e. a pruning bias factor for MCTS). So the brain's MCTS-equivalent process may rely on some extra-graphical properties of the brain-as-embodied-meat-thing.
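For readers unfamiliar with what a "confidence-interval bias" does in MCTS, here is a minimal sketch using the standard UCB1 selection rule. Reading the "pruning bias factor" as the exploration constant `c` is an assumption made for illustration, and the child statistics are invented numbers; nothing here models neurochemistry.

```python
import math

def ucb1_score(mean_value, parent_visits, child_visits, c):
    """UCB1: average value plus a confidence-interval bonus scaled by c."""
    if child_visits == 0:
        return math.inf  # unvisited children are always tried first
    bonus = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_value + bonus

def select_child(children, c):
    """children: list of (name, mean_value, visits); returns the best-scoring name."""
    parent_visits = sum(visits for _, _, visits in children)
    best = max(children, key=lambda ch: ucb1_score(ch[1], parent_visits, ch[2], c))
    return best[0]

# Made-up statistics: "a" looks better and is well explored, "b" is barely explored.
children = [("a", 0.60, 50), ("b", 0.50, 5)]

low_bias_choice = select_child(children, c=0.1)   # small bonus: exploit "a"
high_bias_choice = select_child(children, c=1.4)  # large bonus: explore "b"
print(low_bias_choice, high_bias_choice)
```

Turning `c` down effectively prunes rarely visited branches; turning it up widens the search. A single global scalar like this is the kind of signal the comment suggests a diffusing molecule could carry.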

That will be a couple of additional terms in the activation function. Or am I missing something?
“Neighbouring” is defined in terms of embedding in a metric space and inverse-cube diffusion, rather than anything to do with graph connectivity.

Also, these signals pile up in the synaptic cleft until they’re picked up, so it’s not just about instantaneous transmissivity as if these were radio signals.

But also also, other stuff like monoamine oxidase is floating about in its own diffusion patterns, cleaning up these signals.

It’s basically like a “scent” communication embodied-actor model, but a very complex one where things like redox reactions with the atmosphere occur.

Oh, and there are “secondary messengers”: signals that trigger other signals that, among other things, inhibit the release of the original signal when received back at the sender, such that a dynamic equilibrium state is reached between the two signal types.
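The distinction between spatial neighbours and graph neighbours can be sketched with a toy comparison. The unit names, positions, and edge list below are hypothetical, and the inverse-cube falloff is simply taken from the comment above rather than from any biophysical model; accumulation, clean-up, and feedback are left out.

```python
import math

# Hypothetical toy layout: three units with 3-D positions and an explicit wiring list.
positions = {
    "u0": (0.0, 0.0, 0.0),
    "u1": (1.0, 0.0, 0.0),  # physically close to u0, but not wired to it
    "u2": (4.0, 0.0, 0.0),  # wired to u0, but physically far away
}
edges = {("u0", "u2")}

def graph_coupling(a, b):
    """Influence through wiring only: 1.0 if an edge exists, else 0.0."""
    return 1.0 if (a, b) in edges or (b, a) in edges else 0.0

def diffusion_coupling(a, b, eps=1e-9):
    """Influence through space only: falls off with the cube of the distance."""
    d = math.dist(positions[a], positions[b])
    return 1.0 / (d ** 3 + eps)  # eps avoids division by zero at d == 0

for other in ("u1", "u2"):
    print(other, graph_coupling("u0", other), diffusion_coupling("u0", other))
```

Here `u1` receives no signal through the graph yet the strongest signal through diffusion, which is why the effect cannot be folded into a per-edge activation term without also giving every unit a position.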

I think what you are suggesting is similar to DeepMind's Sokoban bot: https://deepmind.com/blog/agents-imagine-and-plan/
What do you mean by move simulation?
I think he means that the NN somehow learns MCTS without it being coded in explicitly.