


	by zniturah 859 days ago
	They do use Stockfish for playing though … “To prevent some of these situations, we check whether the predicted scores for all top five moves lie above a win percentage of 99% and double-check this condition with Stockfish, and if so, use Stockfish’s top move (out of these) to have consistency in strategy across time-steps.”

2 comments

n2d4 859 days ago

The context of that sentence:

> Indecisiveness in the face of overwhelming victory

> If Stockfish detects a mate-in-k (e.g., 3 or 5) it outputs k and not a centipawn score. We map all such outputs to the maximal value bin (i.e., a win percentage of 100%). Similarly, in a very strong position, several actions may end up in the maximum value bin. Thus, across time-steps this can lead to our agent playing somewhat randomly, rather than committing to one plan that finishes the game quickly (the agent has no knowledge of its past moves). This creates the paradoxical situation that our bot, despite being in a position of overwhelming win percentage, fails to take the (virtually) guaranteed win and might draw or even end up losing since small chances of a mistake accumulate with longer games (see Figure 4). To prevent some of these situations, we check whether the predicted scores for all top five moves lie above a win percentage of 99% and double-check this condition with Stockfish, and if so, use Stockfish’s top move (out of these) to have consistency in strategy across time-steps.
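
For anyone who wants to see the rule in the last sentence spelled out, here is a minimal sketch in Python. It is only my reading of the quoted passage, not the authors' code: `choose_move`, `model_scores` and `stockfish_scores` are hypothetical names, win percentages are represented as probabilities in [0, 1], and requiring exactly five candidate moves is an assumption, since the quote does not say what happens with fewer legal moves.

```python
from typing import Callable, Dict

WIN_THRESHOLD = 0.99  # the 99% cutoff quoted from the paper
TOP_K = 5             # "all top five moves"


def choose_move(
    model_scores: Dict[str, float],
    stockfish_scores: Callable[[], Dict[str, float]],
) -> str:
    """Pick a move given win probabilities in [0, 1], keyed by move string.

    model_scores: the network's predicted win probability per legal move.
    stockfish_scores: hypothetical callback returning Stockfish's win
        probability per move; only called when the fallback might trigger.
    """
    # Rank moves by the model's predicted win probability, best first.
    ranked = sorted(model_scores, key=model_scores.get, reverse=True)
    top = ranked[:TOP_K]

    # Condition 1: every one of the model's top five moves is above 99%.
    # (Assumption: with fewer than five legal moves the fallback is skipped.)
    if len(top) == TOP_K and all(model_scores[m] > WIN_THRESHOLD for m in top):
        engine = stockfish_scores()
        # Condition 2: Stockfish agrees that all of these moves are winning.
        if all(engine.get(m, 0.0) > WIN_THRESHOLD for m in top):
            # Use Stockfish's favourite among these five moves.
            return max(top, key=lambda m: engine[m])

    # Otherwise play the model's own top move, with no search involved.
    return ranked[0]
```

The point of the design is that Stockfish is only consulted once the position is already judged overwhelmingly won by both evaluators; in every other position the model's own top-ranked move is played.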

Vecr 859 days ago

They should try to implement some kind of resolute agent in that case. Might be hard to do if it needs to be "not technically search" though.

paulddraper 859 days ago

But only to complete a winning position.

mtlmtlmtlmtl 858 days ago

That's a crucial part of chess that can't simply be swept under the rug. If I had won all the winning positions I've had over the years I'd be hundreds of points higher rated.

What if a human only used Stockfish in winning positions? Is it cheating? Obviously it is.

paulddraper 858 days ago

> That's a crucial part of chess that can't simply be swept under the rug.

Grandmasters very literally do it all the time.

> What if a human only used Stockfish in winning positions? Is it cheating? Obviously it is.

Yes, but this isn't that.

This is a computer that is playing chess. And FYI (usually) without search.

billforsternz 859 days ago

The process of converting a completely winning position (typically one with a large material advantage) is a phase change relative to normal play, which is the struggle to achieve such a position. In other words, you are doing something different at that point. For example, I, as a weak FIDE CM (Candidate Master), could not compete with a top grandmaster in a game of chess, but I could finish off a trivial win.

Edit: Recently I brought some ancient (1978) chess software back to life https://github.com/billforsternz/retro-sargon. These two phases of chess, basically two different games, were quite noticeable with that program, which is chess software stripped back to the bone. Sargon 1978 could play decently well, but it absolutely did not have the technique to convert winning positions (because this is a different challenge from regular chess). For example, it could not in general mate with rook (or even queen) and king against bare king. The technique of squeezing the enemy king into a progressively smaller box was unknown to it.
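
To make the "box" idea concrete, here is a rough Python sketch. It is a deliberate simplification and not anything taken from Sargon: `box_area` is a hypothetical helper that treats the rook's rank and file as walls and counts the squares on the defending king's side of them, ignoring the attacking king and whether the rook is actually protected.

```python
def box_area(rook: str, king: str) -> int:
    """Squares available to the bare king inside the rook's box.

    Squares are algebraic, e.g. "d4". Returns 0 if the king shares a
    rank or file with the rook (it is in check, not boxed in).
    """
    # Convert algebraic squares to 0-based (file, rank) coordinates.
    rf, rr = ord(rook[0]) - ord("a"), int(rook[1]) - 1
    kf, kr = ord(king[0]) - ord("a"), int(king[1]) - 1
    if rf == kf or rr == kr:
        return 0
    # The king is confined to the files and ranks on its side of the rook.
    width = rf if kf < rf else 7 - rf
    height = rr if kr < rr else 7 - rr
    return width * height
```

With the bare king on g7, a rook on d4 leaves a 16-square box and a rook on f6 shrinks it to 4. Converting the win amounts to making that number fall move after move, which is a goal of a different kind from the usual evaluation of material and position.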

zniturah 859 days ago

That 'only' usage in the winning position could be decisive for gaining a GM rating.

paulddraper 859 days ago

Positions with 99% win percentage are not decisive for GM vs non-GM rating.

mtlmtlmtlmtl 858 days ago

From the paper:

If Stockfish detects a mate-in-k (e.g., 3 or 5) it outputs k and not a centipawn score. We map all such outputs to the maximal value bin (i.e., a win percentage of 100%). Similarly, in a very strong position, several actions may end up in the maximum value bin. Thus, across time-steps this can lead to our agent playing somewhat randomly, rather than committing to one plan that finishes the game quickly (the agent has no knowledge of its past moves). This creates the paradoxical situation that our bot, despite being in a position of overwhelming win percentage, fails to take the (virtually) guaranteed win and might draw or even end up losing since small chances of a mistake accumulate with longer games (see Figure 4). To prevent some of these situations, we check whether the predicted scores for all top five moves lie above a win percentage of 99% and double-check this condition with Stockfish, and if so, use Stockfish’s top move (out of these) to have consistency in strategy across time-steps.

So they freely admit that their thing will draw or even lose in these positions. It's not merely making the win a little cleaner.
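
The "small chances of a mistake accumulate" part is easy to put numbers on. A back-of-the-envelope sketch in Python, assuming a fixed and independent error probability per move; the 1% figure is purely illustrative and does not come from the paper, and `chance_of_slip` is a made-up name:

```python
def chance_of_slip(per_move_error: float, moves: int) -> float:
    """Probability of at least one error over `moves` independent moves."""
    # Complement of making no error on any of the moves.
    return 1.0 - (1.0 - per_move_error) ** moves


if __name__ == "__main__":
    # Illustrative per-move error rate of 1% over games of growing length.
    for n in (10, 40, 100):
        print(n, round(chance_of_slip(0.01, n), 3))
```

Under that assumption a 10-move conversion goes wrong roughly 10% of the time, a 40-move one about a third of the time, and a 100-move one about 63% of the time, which is why an agent that wanders instead of committing to a plan pays for it.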

paulddraper 858 days ago

> So they freely admit that their thing will draw or even lose in these positions.

Yeah, they didn't use Stockfish for the lols.

They created a search-less engine for chess. And then used a search engine to play a small minority of the game.

mtlmtlmtlmtl 858 days ago

Yes. So how is this irrelevant for qualifying as GM-level play then? Being able to play these positions is a clear prerequisite for even being in the ballpark of GM strength. If you regularly choke in completely winning endgames, you'll never get there.

This is cheating, plain and simple. It would never fly in human play or competitive computer play. And it's most definitely disingenuous research. They made an engine, it plays a certain level, and then they augment it with preexisting software they didn't even write themselves to beef up their claims about it.

Someone 859 days ago

They are once your opponents know you’re very bad at converting them.

zniturah 859 days ago

Proof?

To win any game, at some point (at the end of the game) there will be a position with >99% winning chances. The moves that follow are decisive.

littlestymaar 859 days ago

That's not how chess works. The moves that follow aren't usually decisive unless you don't know how to play the game and make enormous mistakes.

Anyone that knows how to play can beat a GM with a big enough advantage at the end of the game (which is what's reflected in the win probability).

famouswaffles 859 days ago

Search isn't used to play/win here. Just for training.

cool_dude85 859 days ago

It looks like it does use search here in the sense that Stockfish's top move is generated using search.

phoe-krk 859 days ago

From the abstract:

> We annotate each board in the dataset with action-values provided by the powerful Stockfish 16 engine, leading to roughly 15 billion data points.

So some of the learning data comes from Stockfish.

paulddraper 859 days ago

The original comment was "for playing."

In training, traditional search is absolutely used to score positions.

In playing, search is not used. (*Except to finish out an already-won position.)