| HN Mirror



	by hyperpape 3887 days ago
	This is hard to evaluate. Explicit metrics would be good, but there's also a definitional point. Depending on who you ask, existing AI programs are as good as a very strong human player. Others will scoff at that characterization. The problem is what "very strong" means. To be precise, current programs can beat many professionals with a 4-stone handicap. Making good analogies is hard, but here's mine: current go programs are probably somewhere in the range of runners running a 2:45 marathon. They're beyond what most people can hope to reach, perhaps even with intensive training, but they're not world-class. In fact, they're not just below world-class: there are thousands of people better than them who are themselves not yet world-class. (Context: I play go and run at a level far below the ones quoted above, but know people who play/run at that level. And I've seen those people get their butts kicked by stronger players/runners.)

2 comments

NhanH 3887 days ago

Another, easier comparison for anyone not familiar with how good a 2:45 marathon is (like me): right now, current programs are at around 2500 Elo in chess terms, or about the weaker end of the International Grandmaster range.


hyperpape 3887 days ago

You're right that the comparison won't work for everyone. Here's another: Franz-Josef Dickhut, currently the 54th strongest player in Europe[1], just won 3-1 against Zen, a top program. He did the same last year against CrazyStone, another top program. There are strong European players, but far fewer than in China, Japan, Korea or Taiwan. So there are thousands, perhaps even ten thousand, humans still stronger than the best programs.[2]

However, the Elo figure seems high to me. A go handicap stone is probably worth more than 100 Elo at those levels, because high-level players are good at conserving small advantages. Additionally, the players in these exhibition matches are usually themselves not the top players in the world. For instance, Zhou Junxun and Ishida Yoshio are two I recall off the top of my head.[3]

[1] http://europeangodatabase.eu/EGD/createalleuro3.php?country=...

[2] If you know how many humans can run a 2:45 marathon, let me know! I know Boston lets you in with a 3:05 time if you're 18-34, and tens of thousands of people run that (some of them are older or women, but many eligible people don't compete...). I'm kinda spitballing on the exact times, but that's roughly how I want to do the comparison.

[3] http://www.goratings.org/
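
To put rough numbers on the handicap-to-Elo argument, here is a minimal sketch using the standard Elo expected-score formula. The 100-Elo-per-stone value is only the lower bound suggested above, not a measured figure, and `handicap_gap` is a hypothetical helper made up for this illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def handicap_gap(stones: int, elo_per_stone: float = 100.0) -> float:
    """Rating gap implied by a handicap, under an ASSUMED Elo-per-stone value."""
    return stones * elo_per_stone


# Gap implied by a 4-stone handicap at the assumed 100 Elo per stone.
gap = handicap_gap(4)

# Expected score of the weaker side in an even (no-handicap) game.
even_game_score = expected_score(0.0, gap)
```

If a stone is worth more than 100 Elo, as argued above, the gap grows and the weaker side's expected score in an even game shrinks further.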


hangonhn 3887 days ago

A 2:45 marathon would generally put you in the elite but non-professional class and give you a good shot at being in the top 10 at the smaller marathons. That's what many former collegiate runners would be able to do.

A good way to make the comparison might be to use percentiles. 2:45 would put you in the 76th percentile.

http://www.heartbreakhill.org/age_graded.htm


kqr 3886 days ago

Another thing to note is that the professionals who can be beaten with a 4-stone handicap are usually pretty late in their careers, and do not play as well as they did in their prime, much less at the level at which current pros in their prime play.


RobertoG 3887 days ago

But, surely, they are going to improve the system, and fast.

I would bet that in a few years it is going to be clear who is superior, the machine or the human.


tel 3887 days ago

That's been happening for a while now. My understanding is that there was a major wall many go AIs were stuck on prior to the invention of Monte Carlo tree search (essentially, a way of evaluating a better play objective function), and they're improving again now. There appears to be some reluctance to evaluate the strength of go programs, though.
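
As a rough illustration of the idea behind that objective function, here is a sketch of flat Monte Carlo move evaluation: each candidate move is scored by the fraction of random playouts that end in a win. This is the simpler building block that Monte Carlo tree search extends, not MCTS itself (there is no tree and no selection policy), and the toy single-heap game is an assumption chosen only to keep the example short. It is not how any go program mentioned in this thread works.

```python
import random


def legal_moves(heap: int) -> list[int]:
    """Toy game (an assumption for illustration): take 1-3 stones from one heap."""
    return [n for n in (1, 2, 3) if n <= heap]


def random_playout(heap: int, to_move: int, rng: random.Random) -> int:
    """Play uniformly random moves until the heap is empty; return the winner.

    Players are 0 and 1. Whoever takes the last stone wins.
    Requires heap > 0 on entry.
    """
    while True:
        heap -= rng.choice(legal_moves(heap))
        if heap == 0:
            return to_move
        to_move = 1 - to_move


def evaluate_moves(heap: int, playouts: int, rng: random.Random) -> dict[int, float]:
    """Estimate player 0's win rate for each legal move via random playouts."""
    rates = {}
    for move in legal_moves(heap):
        remaining = heap - move
        if remaining == 0:
            rates[move] = 1.0  # taking the last stone wins immediately
            continue
        # After player 0's move, player 1 is to move in the playout.
        wins = sum(random_playout(remaining, 1, rng) == 0 for _ in range(playouts))
        rates[move] = wins / playouts
    return rates


rng = random.Random(0)
rates = evaluate_moves(heap=5, playouts=2000, rng=rng)
best_move = max(rates, key=rates.get)
```

With a heap of 5, the estimates should favour taking 1 stone, which leaves the opponent a heap of 4. Full MCTS reuses the same playout statistics but spends more playouts on the moves that look promising and grows a search tree beneath them.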

Here's a chart of current ratings of popular engines on online forums. The rating system is one of "kyu" and "dan" going from 25k to 1k to 1d to 9d.

http://senseis.xmp.net/?KGSBotRatings

Informally, there's another rating system, the "pro" rating system, which is a bit ceremonial but whose ranks are broadly assumed to be higher than the amateur dan rankings.
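
For readers who find the kyu/dan notation opaque, here is a small sketch that places the amateur ranks described above (25k up to 1k, then 1d up to 9d) on a single numeric ladder. The function names are hypothetical, and the sketch deliberately ignores pro ranks. Treating each step as roughly one handicap stone is a common convention rather than something stated in this thread.

```python
def rank_to_step(rank: str) -> int:
    """Map an amateur rank such as '5k' or '3d' to a position on one ladder.

    25k -> 0, ..., 1k -> 24, 1d -> 25, ..., 9d -> 33.
    """
    grade, kind = int(rank[:-1]), rank[-1].lower()
    if kind == "k" and 1 <= grade <= 25:
        return 25 - grade
    if kind == "d" and 1 <= grade <= 9:
        return 24 + grade
    raise ValueError(f"not an amateur kyu/dan rank: {rank!r}")


def rank_gap(stronger: str, weaker: str) -> int:
    """Number of ladder steps between two amateur ranks."""
    return rank_to_step(stronger) - rank_to_step(weaker)


# Example: how many steps separate a 2d player from a 3k player.
steps = rank_gap("2d", "3k")
```

The main thing the ladder makes visible is that kyu numbers count down toward 1k while dan numbers count up from 1d, so 1k and 1d are adjacent.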


hyperpape 3887 days ago

There was a huge improvement in the period of roughly 2005 through a few years ago, at which point programs seemed to be making only slow, incremental progress. At the current rate, it looks like we're talking at least a decade to reach full professional status.

I guess Facebook is claiming they're upending that, but that remains to be seen.
