HN Mirror

	by Lerc 531 days ago
	If it were a complete failure on variations I would be inclined to agree. Instead it was a 30% drop in performance. I would characterise that as limited understanding.

2 comments

My guess is that what’s understood isn’t various parts of solving the problem but various aspects of the expected response.

I see this more akin to a human faking their way through a conversation.

> I see this more akin to a human faking their way through a conversation.

That works in English class. Try it in a math class and you'll get a much lower grade than ChatGPT will.

Fully agree with this