| HN Mirror



	by simianwords 110 days ago
	Let’s ask in good faith. Can you suggest something that it can’t do? Functional things. I’ll reply in good faith and consider it.

2 comments

seanhunter 110 days ago

Say I suggest something: play a valid game of chess at club level (Elo approx. 1200, say) using algebraic notation.

Then you’re either going to say it can or you’re going to say that requires more than 10000 tokens.

This isn’t an interesting conversation and I don’t think you are presenting this challenge in good faith for the reason I gave above.


simianwords 110 days ago

https://chessbenchllm.onrender.com

There are several models with an Elo rating above 1200.

Also https://dubesor.de/chess/chess-leaderboard


psvv 110 days ago

I'll admit that's better than I expected, but these ratings also imply there are plenty of humans who will beat LLMs at chess.


stanford_labrat 110 days ago

every few months i like to ask chatgpt to do the "thinking" part of my job (scientist) and see how the responses stack up.

at the beginning of 2022 it was useless because the output was garbage (hallucinations and fake data).

nowadays it's still useless, but for different reasons. it just regurgitates things already known and published and is unable to come up with novel hypotheses and mechanisms, or ways to test them. which makes sense, given how i understand LLMs to operate.


doomslayer999 110 days ago

I am also a scientist and came to the same conclusion. I just use it to summarize papers, occasionally write boilerplate, and sometimes do some google search primitives if it's an easy question.


simianwords 110 days ago

It is already used in pure math research.


stanford_labrat 110 days ago

sadly it looks like seanhunter was correct, shame.


simianwords 110 days ago

He was literally wrong about chess.


seanhunter 110 days ago

I said “say I said they couldn’t play chess, you will say they can” and you did. That’s literally not wrong.
