by Lerc 531 days ago
If it were a complete failure on variations I would be inclined to agree. Instead it was a 30% drop in performance. I would characterise that as limited understanding.

My guess is that what’s understood isn’t various parts of solving the problem but various aspects of the expected response.

I see this more akin to a human faking their way through a conversation.

> I see this more akin to a human faking their way through a conversation.

That works in English class. Try it in a math class and you'll get a much lower grade than ChatGPT will.

Fully agree with this