by fauigerzigerk 3693 days ago
>Replicating functioning of the brain, or some major subsystem of it, is no doubt going to require far more than just billions of parameters.

Maybe, but we shouldn't forget that computers do not suddenly lose their capability to function as exact, deterministic, programmable machines just because they happen to run an ANN.

What I mean is that there may be shortcuts to reduce the number of required nodes dramatically.

If you take the state of an ANN after it has been trained to perform some specific task, you can ask whether there is a simpler function, i.e. one with far fewer parameters, that approximates the learned function.
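A minimal sketch of that idea, under loud assumptions: the "teacher" below is a hypothetical 1-D network with 200 tanh units whose random weights merely stand in for a trained state, and the "simpler function" is a cubic polynomial fitted by least squares to the teacher's outputs. None of these names or sizes come from a real system; they only illustrate fitting a small model to a large one.

```python
import math
import random

random.seed(0)

# Hypothetical "teacher": 200 tanh hidden units on a scalar input.
# Random weights stand in for a trained state (an assumption for illustration).
HIDDEN = 200
w_in = [random.gauss(0, 1) for _ in range(HIDDEN)]
b_in = [random.gauss(0, 1) for _ in range(HIDDEN)]
w_out = [random.gauss(0, 1) / HIDDEN for _ in range(HIDDEN)]
teacher_params = 3 * HIDDEN  # input weights, biases, output weights


def teacher(x):
    return sum(wo * math.tanh(wi * x + bi)
               for wi, bi, wo in zip(w_in, b_in, w_out))


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit: build the normal equations and
    solve them with Gauss-Jordan elimination and partial pivoting."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    m = [row + [rhs] for row, rhs in zip(ata, aty)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Query the teacher on a grid over [-1, 1] and fit the small "student" to it.
xs = [-1 + 2 * i / 200 for i in range(201)]
ys = [teacher(x) for x in xs]

coeffs = fit_polynomial(xs, ys, degree=3)
student_params = len(coeffs)


def student(x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Baseline for comparison: always predict the teacher's mean output.
mean_y = sum(ys) / len(ys)
baseline_mse = mse(lambda x: mean_y, xs, ys)
student_mse = mse(student, xs, ys)

print(f"teacher parameters: {teacher_params}")
print(f"student parameters: {student_params}")
print(f"student MSE: {student_mse:.3e}  (mean-only baseline: {baseline_mse:.3e})")
```

How close the student gets depends entirely on how smooth the teacher's function happens to be on the sampled range; a real trained network on a real task may need a much richer student than a cubic.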

Sort of like a human with the Occam's razor gene. I think the fact that the number of neurons does not correlate perfectly with intelligence in animals is an indication that there is room for optimization.

1 comment

Absolutely 100% agree, but at the same time, I think we will ultimately need to build and evaluate models that can span the memory of more than one processor. I don't think a single GTX Titan X, GTX 1080 or even a server is enough here.

Additionally, data parallelism and ASGD broadly rule out these larger models, since every replica still has to hold the whole model (yes, I know about send/receive nodes in TensorFlow, but they're not general or automatic enough for researchers IMO), and ASGD makes horribly inefficient use of the very limited bandwidth between processors. All IMO of course. There are hacks and tricks here, but I think those should be late-stage optimizations, not requirements to achieve scaling.

Finally, I'm a stickler for deterministic computation as someone who spent a decade writing graphics drivers before joining the CUDA team in 2006, but that's pretty much a "hear me now, believe me later" opinion of mine after tracking down too many bizarro race conditions late into the night in that former life :-). Of course, one person's race condition can sometimes be an ANN's regularizer, but I digress.

I also agree we'll do some amazing things with far fewer neurons and weights than an actual human brain, but I'll bet you good money we end up needing more than 12GB to do it. AlphaGo alone was 200+ GPUs, right?
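For a rough sense of what 12GB buys, here is a back-of-the-envelope sketch. The assumptions are mine: float32 parameters (4 bytes each), 12 GiB of memory devoted entirely to parameter-sized arrays, and, for training, three such arrays (weights, gradients, one momentum buffer). Activations, workspace, and framework overhead are ignored, so real limits are lower. `max_params` is a hypothetical helper, not any library's API.

```python
GIB = 2 ** 30


def max_params(memory_bytes, bytes_per_param=4, copies=1):
    """Upper bound on how many parameters fit in memory_bytes.

    copies is the number of parameter-sized arrays held at once,
    e.g. weights + gradients + optimizer state.
    """
    return memory_bytes // (bytes_per_param * copies)


card = 12 * GIB

weights_only = max_params(card)          # inference: weights alone
training = max_params(card, copies=3)    # assumed: weights + grads + momentum

print(f"weights only: {weights_only:,} float32 parameters")
print(f"training:     {training:,} float32 parameters")
```

Under those assumptions a single 12GB card tops out at a few billion weights for inference and roughly a third of that for training, which is why anything in the "billions of parameters" range pushes past one processor's memory.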