haha fair point, you can get the expected results with the right prompt, but I think it still reveals a general lack of true reasoning ability (or something)
Or it just shows that it tries to overcorrect the prompt which is generally a good idea in the most cases where the prompter is not intentionally asking a weird thing.
This happens all the time with humans. Imagine you're at a call center and get all sorts of weird descriptions of problems with a product: every human is expected to not expect the caller is an expert and actually will try to interpolate what they might mean by the weird wording they use
This happens all the time with humans. Imagine you're at a call center and get all sorts of weird descriptions of problems with a product: every human is expected to not expect the caller is an expert and actually will try to interpolate what they might mean by the weird wording they use