Hacker News
by voiper1 1018 days ago
They want it to return a single token yes/no, which may not work so well since it doesn't have "space to think". Chain of thought is much more reliable.

But that costs more... and yet they ended up doing this anyway:
> The other key will be 'reason' and include a free text explanation of why you chose Yes or No.

But they asked for yes/no FIRST, then the reason. So they ended up asking for the answer, and then asking the model to _justify_ why that's the answer. For chain of thought to be helpful, you do the opposite: first explain why these addresses match or don't match, then give a final answer. That's the same number of tokens, but the chain of thought happens before the answer, giving the model "space to think".
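To make the ordering concrete, here is a rough sketch of a reason-first prompt plus a parser that rejects replies where the verdict comes before the explanation. The 'reason' key is from the article; the 'answer' key name, the prompt wording, and the helper names `build_prompt` and `parse_reply` are my own assumptions, and the example reply is hand-written, not real model output.

```python
import json
from typing import Tuple


def build_prompt(address_a: str, address_b: str) -> str:
    """Ask for the explanation before the verdict, so reasoning tokens come first."""
    return (
        "Do these two addresses refer to the same place?\n"
        f"Address A: {address_a}\n"
        f"Address B: {address_b}\n"
        "Respond with a JSON object with exactly two keys, in this order:\n"
        "1. 'reason': a free text explanation of why the addresses match or don't match.\n"
        "2. 'answer': 'Yes' or 'No', decided only after writing the reason."
    )


def parse_reply(reply: str) -> Tuple[bool, str]:
    """Return (is_match, reason); reject replies that answer before reasoning."""
    data = json.loads(reply)
    keys = list(data)  # json.loads keeps the key order of the reply
    if keys != ["reason", "answer"]:
        raise ValueError(f"expected keys ['reason', 'answer'] in that order, got {keys}")
    answer = str(data["answer"]).strip().lower()
    if answer not in ("yes", "no"):
        raise ValueError(f"expected Yes or No, got {data['answer']!r}")
    return answer == "yes", data["reason"]


# Hand-written example reply (hypothetical, for illustration only).
example_reply = '{"reason": "Same street and number; St is short for Street.", "answer": "Yes"}'
is_match, reason = parse_reply(example_reply)
```

Checking the key order in the parser is optional, but it catches the case where the model ignores the instruction and goes back to answering first.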

1 comment

This exactly.

When prompted to complete "The moon is made of ", GPT-3.5 returns "cheese" or "green cheese" more than 52% of the time.[1]

This article suggests a method that will be statistically right most of the time, and confidently wrong the rest of it.

[1]: https://www.joshka.net/2023/06/cheese