| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by avereveard 816 days ago
	We already have that, it's called beam decoding, and there are three of thought solutions as well, for each beam you can pick the one with the best logprob, but it's not a given that the result will be better because logprob only capture the model decisiveness not correctness, so it'll still fail if a model is confidently wrong.

1 comments

I think this is different, because you could include tool use in the branches. E.g.

1. rewrite the following question in five different ways.

2. For each version of the question, write python code to do the work.

3. Look at all the outputs, write an answer