| HN Mirror



	by bombadilo 761 days ago
	I mean, in this context I agree. But most people doing math in high school or university are graded on their working of a problem, with the final result usually equating to a small proportion of the total marks received.

3 comments

giaour 761 days ago

This depends on the grader and the context. Outside of an academic setting, sometimes being close to the right answer is better than nothing, and sometimes it is much worse. You can expect a human to understand which contexts require absolute precision and which do not, but that seems like a stretch for an LLM.


phatfish 761 days ago

LLMs being confidently incorrect until they are challenged is a bad trait. At least they have a system prompt to tell them to be polite about it.

Most people learn to avoid the person who is wrong or has bad judgment and is arrogant about it.


ifwinterco 761 days ago

I think current LLMs suffer from something similar to the Dunning-Kruger effect when it comes to reasoning - in order to judge correctly that you don't understand something, you first need to understand it at least a bit.

Not only do LLMs not know some things, they also don't know that they don't know, because they lack true reasoning ability. So they inevitably end up like Peter Zeihan, confidently spouting nonsense.


perfobotto 761 days ago

This is supposed to be a product, not a research artifact.


chongli 761 days ago

> But most people doing math in high school or university are graded on their working of a problem, with the final result usually equating to a small proportion of the total marks received.

That heavily depends on the individual grader/instructor. A good grader will take into account the amount of progress toward the solution. Restating trivial facts of the problem (in slightly different ways) or pursuing an invalid solution to a dead end should not be awarded any marks.


slushy-chivalry 761 days ago

it choked because it didn't solve for `t` at the end

impressive attempt though, it used number of wraps which I found quite clever
