Hacker News
by hackerlight 818 days ago
It helps, but it still gets stuck in local optima based on what it started with. I've never seen it turn around and correct its faulty reasoning unless it tried to actually run the code and observed an Exception. If I respond with "but have you considered XYZ?", my leading question will usually cause it to correct itself, even when it wasn't incorrect.

We need some way to generate multiple independent thoughts in parallel. Each separate thought is constructed using chain of thought to improve the reliability. Then you have some way to "reduce" these multiple thoughts into a single solution. The analogy would be a human brainstorming session where we try to attack the same problem from multiple angles and we try to decorrelate each idea/approach.
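Roughly what I have in mind, as a minimal sketch. `sample_answer` is a hypothetical stand-in for one independent chain-of-thought run (stubbed with canned answers so it runs offline), and the majority vote is just one possible way to "reduce", not the only one:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sample_answer(seed: int) -> str:
    """Hypothetical stand-in for one independent chain-of-thought run.

    A real version would call a model with a nonzero temperature; this stub
    returns canned answers so the example runs offline.
    """
    canned = ["42", "42", "41", "42", "40"]
    return canned[seed % len(canned)]


def reduce_by_vote(answers: list[str]) -> str:
    """Reduce step: keep the most common final answer (majority vote)."""
    return Counter(answers).most_common(1)[0][0]


def solve(n_samples: int = 5) -> str:
    # Map step: run the independent samples in parallel.
    with ThreadPoolExecutor(max_workers=n_samples) as pool:
        answers = list(pool.map(sample_answer, range(n_samples)))
    return reduce_by_vote(answers)


if __name__ == "__main__":
    print(solve())
```

The vote only helps if the samples really are decorrelated; if every run starts from the same faulty premise, they will all agree on the wrong answer.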


We already have that: it's called beam decoding, and there are tree-of-thought solutions as well. You can pick the beam with the best logprob, but it's not a given that the result will be better, because logprob only captures the model's decisiveness, not its correctness. It will still fail if the model is confidently wrong.
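A toy illustration of the failure mode, with made-up numbers: the candidates and their per-token probabilities below are hypothetical, not output from any real model. Selecting by summed logprob picks the confident answer, which here is the wrong one.

```python
import math

# Question: 17 * 24. The correct answer is 408.
CORRECT = str(17 * 24)

# Hypothetical candidates: (final answer, per-token probabilities).
candidates = [
    ("418", [0.9, 0.9, 0.9, 0.9]),  # confident, but wrong
    ("408", [0.6, 0.7, 0.8, 0.7]),  # hesitant, but right
]


def sequence_logprob(token_probs):
    """Sum of per-token log-probabilities, the usual sequence score."""
    return sum(math.log(p) for p in token_probs)


# Pick the candidate the model is most confident about.
best_answer, _ = max(candidates, key=lambda c: sequence_logprob(c[1]))

if __name__ == "__main__":
    print(best_answer, "correct" if best_answer == CORRECT else "wrong")
```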
I think this is different, because you could include tool use in the branches. E.g.

1. Rewrite the following question in five different ways.

2. For each version of the question, write Python code to do the work.

3. Look at all the outputs and write an answer.
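
The three steps above could be sketched like this. Everything model-related is a hypothetical stub: `rewrite_question` and `write_code` stand in for model calls, one branch is deliberately given buggy code, and a simple vote stands in for the final "write an answer" step, which in practice would be another model call over all the outputs.

```python
from collections import Counter


def rewrite_question(question: str, n: int = 5) -> list[str]:
    """Hypothetical stub for step 1; a real version would ask the model."""
    return [f"(variant {i + 1}) {question}" for i in range(n)]


def write_code(variant: str) -> str:
    """Hypothetical stub for step 2; returns Python source that sets `result`."""
    # One variant deliberately gets buggy code, so the branches disagree.
    if variant.startswith("(variant 3)"):
        return "result = sum(range(1, 10))"  # off by one: gives 45
    return "result = sum(range(1, 11))"  # gives 55


def run_code(source: str):
    """Tool use: execute the generated code and capture `result` or the error."""
    namespace = {}
    try:
        exec(source, namespace)  # NOT sandboxed; acceptable only in a toy example
        return namespace.get("result")
    except Exception as exc:
        return f"error: {exc!r}"


def answer(question: str):
    outputs = [run_code(write_code(v)) for v in rewrite_question(question)]
    # Step 3: a vote stands in for handing all outputs back to the model.
    final = Counter(outputs).most_common(1)[0][0]
    return final, outputs


if __name__ == "__main__":
    print(answer("What is the sum of the integers from 1 to 10?"))
```

The point of running the code in each branch is that the outputs are observations rather than logprobs, so a branch that raises or disagrees with the others is visible when the results are compared.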