Hacker News
by hjnilsson 2198 days ago
Another way of thinking about how efficient the brain is: By the article’s numbers, about 5.5 million TPU hours were required to train the machine to play as well as a Go champion.

A Go champion might have trained for 8 hours a day, for 15 years (age 5 to 20). That is about 40 000 hours.

In other words, machines required 137 times longer to learn the game, and at twice the power consumption! There is still a lot of room for improvement.
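As a rough check of the arithmetic above, here is a minimal Python sketch. The 365 training days per year is an assumption (the comment only gives 8 hours a day for 15 years), and the function names are purely illustrative.

```python
# Figures taken from the comment above.
TPU_HOURS = 5.5e6        # TPU hours quoted from the article
HOURS_PER_DAY = 8        # daily training time of the hypothetical champion
YEARS = 15               # age 5 to 20
DAYS_PER_YEAR = 365      # assumption: training every single day


def human_training_hours(hours_per_day=HOURS_PER_DAY, years=YEARS,
                         days_per_year=DAYS_PER_YEAR):
    """Total hours a human spends training under the stated assumptions."""
    return hours_per_day * days_per_year * years


def machine_to_human_ratio(machine_hours, human_hours):
    """How many times more hours the machine used than the human."""
    return machine_hours / human_hours


exact_hours = human_training_hours()      # 8 * 365 * 15
rounded_hours = 40_000                    # the rounded figure used in the comment

print(exact_hours)
print(machine_to_human_ratio(TPU_HOURS, rounded_hours))
print(round(machine_to_human_ratio(TPU_HOURS, exact_hours), 1))
```

With the rounded 40 000 hours the ratio is 137.5, which matches the "137 times" figure. With the unrounded 43 800 hours it is about 125.6, so the rounding shifts the result by roughly 10%. Note that this only compares hours, not energy.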

8 comments

Go champions don't learn from zero. They learn from teachers, books, and playing against each other. This knowledge is built over hundreds, or thousands of years.
AlphaGo didn't learn from zero either. It has a pre-processor that identifies sets of patterns with known features, and also:

"AlphaGo was initially trained to mimic human play by attempting to match the moves of expert players from recorded historical games, using a database of around 30 million moves".

That's for an earlier system (which also used less compute).

AlphaGo was followed by AlphaGo Zero (which is the topic of this article), which did not use the process that you describe. It used only the rules of the game and the winning condition.

Oops, my mistake. Thanks for the correction.
AlphaGo != AlphaGo Zero
Yes! So perhaps one way to make the machine more efficient is by use of pre-programmed “general” models that can be attuned to a particular problem in a much shorter time?
And long collaborative study sessions.
This comparison is not entirely fair because the human brain also benefits from priors baked in over the entire course of evolution.
That's a pretty big claim. One could argue that the topology of the brain is a prior, analogous to the architecture of a neural net. But considering that we really have no idea how learning happens in the brain on a large scale, you really can't say.
I believe that there's a reasonable (but unprovable) assumption that any games which humans usually play - because the rulesets are learnable and interesting for humans - implicitly rely on priors of human brains and behavior.

The space of possible games is huge (infinite?), but only a tiny subset of these games could reasonably become a popular game for humans.

E.g. it's not an arbitrary random coincidence that the scoring rules for each grid intersection in Go are the same (I mean, they could vary in an arbitrary pattern); it ensures that the ruleset is small enough for humans to learn.

It's not an arbitrary random coincidence that playing Go involves pattern recognition on some level, since that's what we're good at and find interesting in many games.

It's not an arbitrary random coincidence that in a Mario game the sprite eventually falls back down after jumping; that's reusing the priors from real-world physics.

Games are designed to be fun and playable by humans, so this doesn't seem surprising.
I don't remember the name, but I definitely saw attempts to build a general AI that was designed to solve 50 different games. There was one long learning phase where the AI learned mechanics common to all 50 games, followed by a much shorter learning phase specific to a single game. The same was attempted with a Minecraft bot. First it just learned how to live and interact in a vanilla world. Then it was fed Twitch PvP livestreams, and finally it was placed on a PvP server. It didn't perform super well, but watching the livestreams was definitely more efficient than learning combat from scratch.
The whole Atari suite has been solved recently by a single algorithm: https://deepmind.com/blog/article/Agent57-Outperforming-the-...
Does anyone have an idea about the advances in research about this topic i.e. human intelligence and how learning happens in the brain?

I know Josh Tenenbaum from MIT [1] works on this, see for example :

- How to Grow a Mind: Statistics, Structure and Abstraction [2]

- Steps towards more human-like learning in machines [3]

Wondering if there are other researchers exploring similar questions.

[1] http://web.mit.edu/cocosci/josh.html

[2] https://www.youtube.com/watch?v=97MYJ7T0xXU

[3] https://www.youtube.com/watch?v=WTK6eaSVTjo

There is very little that we understand about the larger picture of how learning happens in the brain. We have some understanding of how learning happens on very small scales; I'm talking plasticity at a single synapse. But even restricting ourselves to a single synapse, there is much we don't know. At the least, it's clear that synapses and dendrites have impressive computational capacity, but making detailed measurements is currently beyond the reach of our experimental apparatus. We can measure signals in dendrites and synapses, but not at a high enough spatiotemporal resolution to answer the big questions.

And we're starting to bump against the fundamental limits of that apparatus. Most modern neurobiology uses genetically encoded fluorescent sensors read out by rather expensive 2-photon microscopes. The sensors aren't as crisp as one wishes - there is a huge subfield dedicated just to deconvolving these fluorescent sensor readings into what the neurons are actually doing. And there's only so much further the 'scopes can be pushed.

The point being: it's really quite difficult to overstate just how overwhelmingly complex the brain is and how far we are from understanding even little really specific bits of it, let alone the whole thing.

That being said, the Redwood Center for Theoretical Neuroscience does some excellent work bridging the cutting edge of theory neuro and machine learning - towards the larger picture of how the brain works. You might be surprised at how 'rudimentary' the questions we're trying to solve in that domain are. Most work focuses on the visual system - it is far easier to study something when you have a good idea of what it's supposed to do (as opposed to, say, cortex).

I am not aware of anything resembling a grand theory that makes experimentally verifiable predictions. I am pretty sure I would have heard of such a thing if it existed.

True, but a spider doesn't figure out how to build a web all by itself. That's to say, there is a lot that evolution can provide us with as a prior.
Yes, it is clearly possible to encode behavioral priors; there are many examples from different species.

But humans aren't spiders. We've got the big brain; it's kind of our thing.

Not just the topology of the brain, but the environment is also important. Human life is more diverse than that of AlphaGo, we can borrow concepts gained while doing something else. Should we count those external tasks as part of the learning to play Go?
Yeah only the last few human layers needed to be trained for the GO expansion pack, all the early layers were frozen during GO training.
OTOH, I expect that avoiding human evolutionary priors is necessary for superhuman performance.
So did the machine, albeit indirectly.
But there are also many other people who spent time studying Go and didn't reach that level. We ran all that studying in parallel and then selected the best person by running a world championship. You can't count his effort alone.
True. But that single brain, in that person, was that efficient, and it represents the theoretical gap in efficiency to the machine.

There are, for example, other NNs also being trained to play Go. Should all unsuccessful attempts be counted in the machine total? The comparison becomes almost impossible then.

https://github.com/lightvector/KataGo

>KataGo's latest run used about 29 GPUs, rather than thousands (like AlphaZero and ELF), first reached superhuman levels on that hardware in perhaps just three to six days, and reached strength similar to ELF in about 14 days. With minor adjustments and a few more GPUs, starting around 40 days it roughly began to match or surpass Leela Zero in some tests with different configurations, time controls, and hardware. And finally after about four months of training time, the current run may be wrapping up fairly soon, but we hope to be able to continue it or begin another run in the future.

> In other words, machines required 137 times longer to learn the game, and at twice the power consumption!

This comparison is a bit unfair. Humans are the result of evolution on a grand scale. Human Go is the result of millennia of gameplay. A human does not become a grand master in isolation.

AlphaGo is the result of an evolutionary, tournament-style competition of a much smaller duration and breadth. AG is also a population, not just one agent, and it would be silly to take just one agent and evaluate it on its own as if it could be created without the others.

Should we include the human costs in AG as well? Why just the electricity and CPU?

AlphaGo is the result of human evolution too.
Human evolution is a result of mammal evolution, and so on. The question is how to compare in a fair way?
It'd be really interesting if a research group could do an entropic calculation of how efficient training any given neural network could be. As in, what is the thermodynamic limit of the most efficient possible NN training, in terms of joules per bit trained? My hunch is that human brains operate close to this limit, at least in our standard environmental conditions. Based on how near-optimal biomaterials are in terms of strength-to-weight ratios, it wouldn't surprise me much.
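One thermodynamic floor that is sometimes invoked for questions like this is Landauer's bound, k_B * T * ln 2 joules per bit erased. The sketch below only evaluates that formula. Whether a "bit trained" corresponds to a bit erased is an open assumption, and the body temperature (310 K) and brain power (20 W) are rough, commonly quoted figures used here purely for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def landauer_limit_joules(temperature_kelvin):
    """Minimum energy to erase one bit at the given temperature (Landauer's bound)."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


def max_bit_erasures_per_second(power_watts, temperature_kelvin):
    """Upper bound on bit erasures per second for a given power budget."""
    return power_watts / landauer_limit_joules(temperature_kelvin)


BODY_TEMPERATURE_K = 310.0  # assumption: roughly 37 degrees Celsius
BRAIN_POWER_W = 20.0        # assumption: rough, commonly quoted figure

print(landauer_limit_joules(BODY_TEMPERATURE_K))
print(max_bit_erasures_per_second(BRAIN_POWER_W, BODY_TEMPERATURE_K))
```

At 310 K the bound comes out to roughly 3e-21 joules per bit, so a 20 W budget allows on the order of 7e21 bit erasures per second at most. That is only an upper limit on raw bit operations and says nothing about how many of them a given learning process actually needs.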
I think the problem you'd find is that "bit trained" is probably highly non-trivial.

For example, I expect that the training required to go from 7-year-old child to Go grand master requires a completely different number of bits of information than the training required to go from blank-slate NN to NN Go grand master. I also suspect that the difference in what is being learned may well dominate the difference in training efficiency. Both the prior knowledge and the mechanism of learning are so different that I doubt you could get a meaningful comparison based on current understanding.

You should remember that we have basically no idea how human beings actually learn things, and no idea how much prior knowledge we have encoded. Just as an example, I once saw a documentary that claimed chess grandmasters seem to recognize valid chess positions using the parts of the brain that usually recognize faces. Assuming that was true (I'm not claiming it is), perhaps part of their chess learning consisted of taking a built-in face-recognizing NN and training it to recognize chess boards. How much did the built-in knowledge of recognizing faces help? I don't think it would be possible to calculate.

Agreed. After writing that, I realized that "bits" of training is a pretty poor metric, especially in a lossy NN as compared to normal computing. Researchers will likely be busy for decades defining and narrowing down the concepts in the field before useful values can be determined in terms of information theory.

A huge issue I hadn't even considered is that bits don't relate very directly to an NN's ability to perform a task.

Can't help but remark that "7-year-old child" is not a valid go rank. Some 7 year olds are surprisingly good at playing go :)
True, I should probably have said something safer, like 1-year-old child :)
Rather than "Go champion", I would use the term "Go professional". There is a difference between being a professional and winning professional tournaments.

Now, the bot has many advantages. It never sleeps, never gets distracted, never dies and can be copied to another system to obtain a copy of the bot with the same playing performance.

The bot is also more accessible. Any player can now train with a bot, all day if they want, almost for free. You cannot do that with a professional.

A human does not learn Go from scratch on his own. He uses teachers and books, which present compressed knowledge crystallized from many millions of human hours.

If you asked someone to learn Go but gave him only the rules of the game, he would likely be a weak player (although probably with some original strategies).