| HN Mirror



	by markers 1364 days ago
	In case you haven't seen it, some new evidence surfaced yesterday: https://youtu.be/jfPzUgzrOcQ

10 comments

Tenoke 1364 days ago

In one of those games, Niemann was losing by 1.3 points in the first 10 moves despite being at 100% according to this analysis, so I'll take it with a grain of salt. This only looks at whether a move is within the top 3 engine moves of any one of the engines tested, and sometimes there are just 1-2 good moves, so playing 1 out of 10 (or whatever) possible moves doesn't mean you did anything good. Further, it's unclear how cherry-picked it is. If it was that obvious I'd think the other analyses would've caught it, which they didn't.

You can find more discussion of it on reddit, but the threads are generally all over the place.

https://www.reddit.com/r/chess/comments/xofl99/one_of_the_10...

Etheryte 1364 days ago

A 1.3 difference out of an opening sideline is neither rare nor lost, and mostly simply comes down to the fact that the engine doesn't understand the opening. Even the mod pin in the link you posted clearly outlines that this is a misleading way to frame this. It would be better to read more into the discussion before helping misinformation spread.

Tenoke 1364 days ago

I didn't describe it as lost like the poster did, I said 'losing by 1.3' which is accurate. At any rate, if you actually did read deeper you'd see that losing by 1.3 is plenty relevant, when claiming 100% engine play correlation, and that the mod is somewhat cherry-picking. Further, playing openings where you are -1.3 in 2000 blitz is fine, but in super GM games 1.3 points down is more often than not pretty bad.

zone411 1364 days ago

"The engine doesn't understand the opening" might have applied 10 years ago but you'll have a very hard time finding a single opening in all of chess where it would be the case now.

roflyear 1364 days ago

1.3 is evaluated as more than a pawn, which is significant.

jasonwatkinspdx 1364 days ago

This analysis has since been retracted after a wide variety of people explained to her that she was misunderstanding fundamentals of how to do this.

The discussion on /r/chess is pretty good.

boole1854 1364 days ago

While interesting, this does not seem very convincing to me. They successfully show that Niemann was playing many games with a high percentage of "engine perfect" moves, but they do not do enough to show that this is inconsistent with what top players usually do. A whole distribution of scores is shown for Niemann but only limited summary statistics are shown for other top players. A proper comparison would involve showing the same type of data for both.

VanillaCafe 1364 days ago

> They successfully show that Niemann was playing many games with a high percentage of "engine perfect" moves, but they do not do enough to show that this is inconsistent with what top players usually do.

I thought the video very much did make that case. A single known cheating game had a 98% correlation (Sebastien Feller Paris 2010), other GMs have generally at most 75% average correlation. The analysis had more than half a dozen games with Niemann at 100% correlation. If that's cherry picking, it seems like there are a lot of cherries to pick.

roflyear 1364 days ago

Yah and Hikaru was able to find games of his that were 100% too, fairly quickly: https://clips.twitch.tv/FaintCuteKumquatPhilosoraptor-hDvbAj...

usgroup 1364 days ago

Well she shows 6+ games where he has 100% correlation with the engine. What are the chances?

aqme28 1364 days ago

> What are the chances?

We don't know! That's why this is an incomplete analysis. A comparison against other players of his caliber would answer that question.

usgroup 1362 days ago

Hikaru tried to answer just that:

https://www.youtube.com/watch?v=qjtbXxA8Fcc

boole1854 1364 days ago

Yes, that's the question that I wish she had tried to answer. What are the chances? Without checking for that pattern by other 2600+ GMs, we don't know the answer.
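
A rough Python sketch of why the baseline matters for "what are the chances?". It assumes games are independent and that each game has the same probability `q` of coming out at 100% correlation; both are simplifications. The candidate values of `q` and the game count `n_games` are hypothetical placeholders, since the thread does not supply a measured rate for comparable GMs.

```python
from math import comb

def prob_at_least(k: int, n: int, q: float) -> float:
    """Binomial tail: probability of at least k 'perfect' games out of n,
    if each game independently has probability q of being perfect."""
    return sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(k, n + 1))

# Hypothetical inputs: 6 or more perfect games out of 200 played.
n_games = 200
for q in (0.001, 0.01, 0.05):  # assumed baseline rates, not measured values
    print(f"q={q}: P(>=6 of {n_games}) = {prob_at_least(6, n_games, q):.6f}")
```

The answer moves by orders of magnitude depending on `q`, which is why the same statistic for other 2600+ players is needed before six such games can be called unlikely.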

goatlover 1364 days ago

Hikaru, who ranges from high 2700s to 2800s, said only one of his games is at 100%, which was his best game. Hans having multiple 100% is suspicious.

8note 1364 days ago

How long was the engine run for those cases? Does the engine output change if you double or halve the depth?

100% correlation to an output that can be tuned doesn't seem that exciting

johncessna 1364 days ago

And then explains the odds, to both this one and the parent's question

__s 1364 days ago

https://imgur.com/a/JpJsMyI

glumreaper 1363 days ago

OMG look at the Y axis scale of the graphs. Everyone else is 0-1000+ (often 0-2000), Magnus Carlsen's graph scale is 0-200.

__s 1363 days ago

Yeah, because Carlsen's is only a sample of 426 games

The first 4 are the most interesting, having the same sample size of 4000. But across the board players tend to show little distinction between choosing moves that lose 0.0 and 0.1, except one player

roflyear 1364 days ago

That's hardly evidence.

As a 1600 player, I have played some 0-0-0 games on Lichess. I didn't cheat. I just play a lot of chess games and during those games, my opponent was really bad, so I had a perfect game (according to the engine).

mikenew 1364 days ago

You're conflating accuracy with engine correlation. Having a perfectly accurate game means you didn't make any moves that caused a centipawn loss. Having 100% engine correlation means you're making the exact moves the engine would make.

dak1 1364 days ago

I think I have on 3-4 occasions played a game where, after evaluating on chess.com, I got 100% accuracy (which is engine correlation). A couple of times it was all theory and then a blundered mate in 1, but...

I did have one game where I didn't know the theory except a very vague recollection in the beginning. I actually thought I had blundered in that game and was trying to figure out what I'd do if my opponent made a certain move — they didn't find it, I ended up winning material in a tactic and they resigned — I was in complete shock when it came back 100% accuracy (and I definitely did not see the engine response to the move I was worried about, which was the best move).

I'm only around 1600-1700 on chess.com.

Not taking a position either way on Hans, but I have no doubt he knows far more theory than I do (and I do know some lines 20+ moves deep), and correlating with an engine is not impossible even outside of book.

zarzavat 1364 days ago

To repeat what was said above, accuracy is not the same as engine correlation.

Engines often play moves that are counterintuitive and weird, but nonetheless good. This is because they can evaluate large trees of tactics in a way that humans cannot.

If a human finds a natural move that is just as good as the engine move (in terms of evaluation), they are still playing accurately, but they are uncorrelated with the engine. Playing accurately is not a sign of cheating. Playing many engine moves is a sign of cheating.
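
A minimal Python sketch of the distinction described above. The `Position` class, the move names, and the centipawn numbers are all made up for illustration; no real engine is called. Ties between equally rated moves are assumed to be broken by listing order, standing in for whichever move the engine would list first.

```python
from dataclasses import dataclass

@dataclass
class Position:
    # Hypothetical engine evaluations (centipawns, from the mover's side)
    # for each candidate move, listed in the engine's order of preference.
    evals: dict[str, int]
    played: str

def centipawn_loss(pos: Position) -> int:
    """Accuracy view: how much worse the played move is than the best one."""
    return max(pos.evals.values()) - pos.evals[pos.played]

def matches_top_k(pos: Position, k: int = 1) -> bool:
    """Correlation view: is the played move among the engine's first k choices?"""
    ranked = sorted(pos.evals, key=pos.evals.get, reverse=True)  # stable on ties
    return pos.played in ranked[:k]

def summarize(game: list[Position], k: int = 1) -> tuple[float, float]:
    """Return (average centipawn loss, fraction of moves matching top k)."""
    n = len(game)
    acpl = sum(centipawn_loss(p) for p in game) / n
    corr = sum(matches_top_k(p, k) for p in game) / n
    return acpl, corr

game = [
    Position({"Nf3": 30, "d4": 30, "h4": -50}, played="d4"),   # equal eval, not first choice
    Position({"e4": 25, "c4": 20, "a3": -10}, played="e4"),
    Position({"Qh5": 40, "Nc3": 40, "g4": -80}, played="Nc3"),  # equal eval, not first choice
    Position({"O-O": 15, "h3": 10, "Ke2": -120}, played="O-O"),
]

print(summarize(game, k=1))  # zero centipawn loss, but only half match the first choice
print(summarize(game, k=3))  # top-3 matching is trivially satisfied with three candidates
```

In this toy game every played move loses nothing, so the game is perfectly accurate, yet only half the moves coincide with the first choice. Widening the match to the top 3 pushes the correlation to 100% here, which is the leniency concern raised earlier in the thread.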

roflyear 1364 days ago

Those games don't have 100% engine correlation, either. The entire video is a mess.

drexlspivey 1364 days ago

The engine scores centipawn loss against the perfect move (according to the engine). The engine plays the move with the lowest centipawn loss itself. How are those two different?

lupire 1363 days ago

If there are multiple good moves, they all count as accurate.

boole1854 1364 days ago

So there are "really bad" opponents at the 1600 level, but is it reasonable to think there are "really bad" opponents at the 2600 level? It's a different world up there.

iends 1364 days ago

Right, op is making a mistake in thinking that a perfect game against a 1600 is the same as a perfect game against a GM. GMs will intentionally play less perfect moves to head towards complications where they will come out ahead. When I start to bang out 15 moves of theory against an IM/GM they will recognize it and play something I’m not familiar with and just win more quickly.

roflyear 1364 days ago

Those games were against opponents 100-200pts lower rated than Hans, in some cases.

joshuamorton 1364 days ago

You, a chesscom 1600, played a (or multiple) perfect game against a chesscom 2600 and/or strong IM?

Link?

roflyear 1363 days ago

No. I never said that.

sudosysgen 1364 days ago

This is complete garbage. Real statistical analysis has been done, and has been inconclusive so far. Cherry picking games is ridiculous - at the 2800 level, a GM will only deviate from the engine's top moves 0-3 times. It would be expected that an exceptional performance would remain within top engine moves if someone was able to play at that level.

goatlover 1364 days ago

Hans isn't at the 2800 level, but Magnus is almost 2900, and probably has good reason to be suspicious of someone playing way above their rating in tough matches.

sudosysgen 1364 days ago

If Hans was to beat Magnus, he would have to play at least at the 2800 level, at which a game with only top moves is quite frequent.

asvitkine 1363 days ago

Source for "quite frequent"?

usgroup 1364 days ago

Yeah this is interesting … she shows that Hans had many tournaments where he shows record-setting move accuracy as measured by correlation to Stockfish 15, and that 6 of these tournaments occurred in a row. She also shows that for those tournaments, even by Regan's model, Hans's results would be like a 1/70000 chance if legit.

t_mann 1363 days ago

The key point (in the end) seems to be that the odds for a streak like Niemann had are about 1 in 80k. Statistically speaking, I'd say that's a long way from a smoking gun. Here's a good rundown of a case where a cheater is considered to have been exposed by statistical evidence, but those odds were on a completely different scale, 10^22:

https://www.youtube.com/watch?v=8Ko3TdPy0TU

Basically the mistake that is easy to make is that we shouldn't ask: "what is the probability that Hans plays five tournaments like that in a row?", but "what is the probability that someone will play five tournaments like that in a row?". Even if we correct for the fact that there are probably more Minecraft speedruns happening than GM tournaments, odds of 80k just seem a bit too low to call it evidence.
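
The arithmetic behind this point can be sketched in Python. The 1-in-80k figure is the one quoted above; the pool sizes are hypothetical, and streaks are assumed to be independent across players, which is a simplification.

```python
def prob_at_least_one(p_single: float, n_players: int) -> float:
    """Chance that at least one of n_players independently produces
    an event that has probability p_single for any single player."""
    return 1.0 - (1.0 - p_single) ** n_players

p_streak = 1 / 80_000  # per-player odds quoted in the comment above

# Hypothetical pool sizes: how many players could have produced such a streak.
for n in (1, 1_000, 10_000, 80_000):
    print(f"pool of {n:>6}: P(someone has the streak) = {prob_at_least_one(p_streak, n):.4f}")
```

For a single named player the chance stays at 1 in 80k, but it grows quickly with the size of the pool being searched; at a pool of 80,000 it is roughly 63%. How large the relevant pool really is remains the open question.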

bombcar 1364 days ago

This seems interesting, any chess-knowing people willing to take the hit for the team and watch all 23 minutes?

tgtweak 1364 days ago

The video uses an ultra-high ELO chess engine to score each possible move, then goes back through the player's moves to see how often each would have been a positive move (one that shifts the balance of the game in your favor) or a perfect move (not sure which). It is extremely rare to make 100% perfect moves in a game, let alone a series of games. Typical gameplay for a high-level chess player doesn't peak over 72-75% for a given series of N games. Niemann has several tournaments over this and several games with 100% perfect moves. The inconsistency is also a concern since he goes from the mid-60s to 78/79 in the span of one tournament.

His games against Magnus were exceedingly high.

activitypea 1364 days ago

It's also worth pointing out that a player's odds of making the perfect move are inverse to their opponent's ELO: as the level of play rises, finding the right play becomes exponentially harder. The data suggests he's sometimes playing other grandmasters as good as those grandmasters would play a rando on Lichess.

lupire 1363 days ago

It's Elo, not an acronym.

roflyear 1364 days ago

It's not extremely rare. Stop pulling things out of your butt.

THE FIRST GAME Hikaru opened when he tried to check his games was 100%. He opened a random fucking game!

amflare 1364 days ago

GP is not pulling this out of their butt, they are summarizing the video like GGP requested.

Also, your anecdote doesn't prove anything.

User23 1364 days ago

If someone wins the powerball the first time they buy a ticket it’s still a rare event.

roflyear 1363 days ago

yeah - if I pulled a random winning powerball ticket out of my massive pile of powerball tickets, that would be a really, really rare event. It would make me believe that it isn't such a rare thing, for sure.

primitivesuave 1364 days ago

FM Yosha puts forward a fairly convincing argument about odds and engine correlation, but another commenter rightly pointed out that these statistics are not seen as incriminating in and of themselves. Unfortunately, even when the preponderance of evidence seems to be against a player - best example is Sébastien Feller (https://en.wikipedia.org/wiki/S%C3%A9bastien_Feller) playing with superhuman accuracy at crucial moments, and whose team captain later admitted to helping him cheat - they can still cast enough doubt to be allowed to continue playing at the highest level.

Here is a blunder that Feller played on move 13 just over a month ago (https://new.chess24.com/wall/news/grandmaster-blunders-mate-...) - this same guy managed to draw against Magnus Carlsen in 2008, in a game where Carlsen also found the moves/mannerisms of his opponent highly unusual.

roflyear 1364 days ago

It's been talked to death. The consensus is you cannot just cherry-pick some games and claim he's cheating.

Everyone has games that are perfect. Everyone. Not just GMs or Super GMs. I have at least a few perfect games and I'm half the rating Hans is.

The games analyzed also have crazy blunders by his opponents.

goatlover 1364 days ago

Perfect when compared to the moves the top chess engines would make? Hikaru says he only scored 100% one time, and 70% is more typical for a GM, yet everyone does it?

roflyear 1363 days ago

First, it isn't THE top moves. It is ONE of the top moves. Huge difference.

Where does Hikaru say he only has a 100% correlation game one time? I've seen lots of examples of other players having such games.

tpoacher 1364 days ago

the gist seems to be that he has unrealistically high correlation with game-engine recommendations, often all the way up to 100%, but only when playing "tough" opponents, and far lower / realistic correlation scores (around 50%) in other games.

for reference, magnus carlsen's correlation score at his peak averages around 70% (according to the video)

peterhunt 1364 days ago

AI-written summary: https://www.summarize.tech/youtu.be/jfPzUgzrOcQ

freediver 1364 days ago

Cool! Is this something you are working on?

peterhunt 1363 days ago

Yes! I built this for exactly this use case :)

dematz 1363 days ago

This is super cool! I tried it on a video I recorded a while ago that I completely forgot about and was like wait wow the summary came away with points that I'd be glad a viewer got.

One thing is it was a very long and rambling video and probably didn't do a great job of motivating examples rather than just getting bogged down in them for a while, so the summary doesn't really say how the examples support the central claim, but that may be the fault of the video honestly lol...

Also a few basic errors like writing "medium" where I'm pretty sure I said or at least meant "median" and in one case, I'd have to go back and watch this to be sure, but it seems like the summary says something is better in B than in A when I was saying it's better in A than B. The summary definitely touches the right content but I'm not sure it's correct.

Also funnily I have a tendency to sprinkle the word "like" liberally (for better or worse) and the summary copies some of the sentences verbatim, starting with "Like..."

(completely off the topic of cheating in chess, sorry...)

Invictus0 1364 days ago

He played several games with 100% correlation with what chess engines considered to be the best move, and also played in 5 consecutive tournaments with such a high fraction of engine-preferred moves that his performance rivals the best players in history at the pinnacle of their careers.

roflyear 1364 days ago

Yes, and so does.... everyone else that is 2600+. And lots of people who aren't.

Bud 1364 days ago

That's simply a lie.

roflyear 1363 days ago

It isn't a lie. Every person over 2600 will have games that 100% correlate with the engine using these methods.

I have games, as a 1400-1600 that are perfect games.

robswc 1364 days ago

It would be nice to get an ELI5 on this too. I used to play chess and have an understanding of the significance... but I don't think I can fully appreciate it as well as someone with a solid background in both.

tgtweak 1364 days ago

Hard to cheat statistics.

This is how they find accounting fraud as well.

BeetleB 1364 days ago

No - this is how they find suspects for accounting fraud. They still need to show actual proof of fraud.

drexlspivey 1364 days ago

How do you suggest they do that in online tours?

Tenoke 1364 days ago

It's actually fairly easy to cheat statistics. It happens literally all the time in academia. There are a thousand ways to make a statistical analysis believably say what you want, such that even other professionals won't realize it unless an expert does a thorough analysis.

Sebb767 1364 days ago

It's easy to cheat a statistic you create. It's quite hard to cheat a statistic where you don't know who will look when at which particular data points.

Tenoke 1364 days ago

Which is exactly the case here, as they decided what data to look at and how, and possibly had a bias.

Further, clearly the analysis wasn't so irrefutable given that they admitted faults with it after others pointed out mistakes[0]

0. https://twitter.com/IglesiasYosha/status/1574308784566067201?

Sebb767 1364 days ago

I did not intend to attack or defend Hans with that statement, I just wanted to point out that both you and the original comment could be right at the same time. That being said, it's quite funny that this case showed both sides.

roflyear 1364 days ago

Yeah, but it is also extremely easy to misrepresent statistics!

TheAceOfHearts 1362 days ago

Since younger players are growing up and developing with these incredibly powerful chess engines, I wonder if that plays a role in their ability to play at such a level.

If this were the case, I think we'd see younger players more likely to get these 100s more often as they're learning from chess engines.

Does anyone familiar with the topic know if this makes sense?