LLMs will fail this simple eval every time

	LLMs will fail this simple eval every time (chatgpt.com)
	3 points by ifuknowuknow 731 days ago

3 comments

Jimmc414 731 days ago

What am I missing? It responded with specific examples of what it didn’t know.

nabla9 731 days ago

For example:

>The exact number of grains of sand on a specific, untouched beach.

Just saying "specific" is not being specific.

"The exact number of grains of sand on the Source d’Argent beach in the Seychelles" would be a specific, exact piece of knowledge it does not have. It fails to name an exact, specific thing.

It's as if it is compelled to be generic rather than specific and exact.

"any given town.", "undiscovered ancient manuscript", "hypothetical alien civilization's technology", "of a specific person", "random individual", "unidentified and uncatalogued species"
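
A rough way to check for this pattern automatically, as a sketch only: flag any answer that contains one of the placeholder words seen in the phrases quoted above. The marker list and the function names (`generic_markers_found`, `looks_generic`) are my own assumptions for illustration, and a keyword match like this will miss plenty of vague answers.

```python
import re

# Placeholder phrases drawn from the examples quoted in this thread.
# The list is illustrative, not exhaustive.
GENERIC_MARKERS = [
    r"\bany given\b",
    r"\ba specific\b",
    r"\brandom\b",
    r"\bhypothetical\b",
    r"\bundiscovered\b",
    r"\bunidentified\b",
    r"\buncatalogued\b",
]


def generic_markers_found(answer: str) -> list[str]:
    """Return the placeholder patterns that occur in the answer."""
    lowered = answer.lower()
    return [p for p in GENERIC_MARKERS if re.search(p, lowered)]


def looks_generic(answer: str) -> bool:
    """Crude heuristic: any placeholder phrase marks the answer as generic."""
    return bool(generic_markers_found(answer))


# The model's answer trips the check; a named beach does not.
print(looks_generic("The exact number of grains of sand on a specific, untouched beach."))
print(looks_generic("The exact number of grains of sand on Source d'Argent beach."))
```

Passing this check only means no placeholder word was found. It does not show that the named thing is really unknown to the model.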

mtmail 731 days ago

Can you explain the failure?

ifuknowuknow 731 days ago

1) It can't name any one specific thing it doesn't know.

2) When it does name a specific thing it doesn't know, it's a thing it actually knows well.

This line of conversation looks monkeypatched by OpenAI because it reveals a massive vulnerability in reasoning, but the failure is consistently reproducible across all LLMs.
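
The two failure modes listed above could be turned into a two-step check, roughly like this. It is a sketch under assumptions: the prompts are hypothetical wording, not the ones from the linked chat, and `ask`, `is_specific`, and `shows_knowledge` are stand-ins the caller has to supply (a model call plus two judges, human or automated).

```python
from typing import Callable


def run_eval(
    ask: Callable[[str], str],
    is_specific: Callable[[str], bool],
    shows_knowledge: Callable[[str], bool],
) -> str:
    """Two-step check mirroring failure modes 1) and 2).

    ask:             sends a prompt to the model, returns its reply (assumed)
    is_specific:     judges whether the reply names one concrete thing (assumed)
    shows_knowledge: judges whether the follow-up reply shows real knowledge (assumed)
    """
    # Step 1: the model must name one concrete thing it does not know.
    claim = ask("Name one specific thing you do not know.")
    if not is_specific(claim):
        return "fail: not specific"

    # Step 2: probe the named thing; knowing it well contradicts the claim.
    follow_up = ask(f"Tell me what you know about this: {claim}")
    if shows_knowledge(follow_up):
        return "fail: knows it after all"

    return "pass"


# Usage with a stub model and stub judges, just to show the flow.
print(run_eval(lambda prompt: "any given town", lambda s: False, lambda s: False))
```

The hard part is the two judges, which this sketch leaves open on purpose.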

ifuknowuknow 731 days ago

we know what we don't know :)
