| HN Mirror



	by vgo96 327 days ago
	I think you are discrediting LLMs; Gemini 2.5 Pro catches most of the flaws in the author's article. I think the author just doesn't understand floating point.

3 comments

withinboredom 327 days ago

So does ChatGPT 4.1: https://chatgpt.com/share/6888d177-4ebc-8013-b3a2-2648ebea91...


cmrx64 327 days ago

to properly plumb the LLM here, you should also ask it afresh, “Tell me what is right with this:”

I prompted them with nothing but the content, and on their own they either decided it was all nonsense or took the bait and started praising it.


kragen 327 days ago

How do you know if Gemini caught the flaws you didn't notice?


cmrx64 327 days ago

possibly so. I’m even seeing GPT 4.1-mini rip it apart when prompted with only the content. DeepSeek (without thinking enabled) is fooled.
