by hoosieree 623 days ago

My classifier is not very accurate:

    is_trick(question)  # 50% accurate

To make the client happy, I improved it:

    is_trick(question, label)  # 100% accurate

But the client still isn't happy because if they already knew the label they wouldn't need the classifier!
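For anyone who wants the joke spelled out, here is a minimal Python sketch of the same label leak. Everything in it is an assumption made for illustration: the dataset is synthetic, the coin-flip `is_trick` only stands in for the 50% classifier, and `is_trick_leaky` and `accuracy` are hypothetical names.

```python
import random

# Hypothetical data: 1000 questions, half of them labeled as trick questions.
dataset = [(f"question {i}", i % 2 == 0) for i in range(1000)]

rng = random.Random(0)  # seeded so the run is repeatable

def is_trick(question):
    # Stand-in for the original classifier: a coin flip.
    return rng.random() < 0.5

def is_trick_leaky(question, label):
    # The "improved" classifier: it just echoes the label it was handed.
    return label

def accuracy(predict):
    # predict takes (question, label) and returns a bool.
    hits = sum(predict(q, y) == y for q, y in dataset)
    return hits / len(dataset)

guess_acc = accuracy(lambda q, y: is_trick(q))  # the label is ignored here
leaky_acc = accuracy(is_trick_leaky)            # the label is the answer
print(f"guessing: {guess_acc:.0%}, with label: {leaky_acc:.0%}")
```

The leaky version scores perfectly on any dataset, which is exactly why the score says nothing about the classifier.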

...

If ChatGPT had "sense", your extra prompt would do nothing. The fact that adding the prompt changes the output should be a clue that nobody should ever trust an LLM anywhere correctness matters.

[edit]

I also tried the original question but followed up with "is it possible that the doctor is the boy's father?"

ChatGPT said:

Yes, it's possible for the doctor to be the boy's father if there's a scenario where the boy has two fathers, such as being raised by a same-sex couple or having a biological father and a stepfather. The riddle primarily highlights the assumption about gender roles, but there are certainly other family dynamics that could make the statement true.

PoignardAzur 623 days ago

It's not like GP gave task-specific advice in their example. They just said "think carefully about this".

If that's all it takes, then maybe the problem isn't a lack of capabilities but a tendency not to surface them.

hoosieree 619 days ago

The main point I was trying to make is that adding the prompt "think carefully" moves the model toward the "riddle" region of its vector space, which means it will draw tokens from there instead of from where the original question landed.

And I doubt there are any such hidden capabilities, because if there were, it would be valuable to OpenAI to surface them (e.g. by adding "think carefully" to the default/system prompt). Since adding "think carefully" changes the output significantly, it's safe to assume this is not part of the default prompt, perhaps because adding it doesn't help the average query.