| HN Mirror



	by CuriouslyC 802 days ago
	In the research literature, this process is done not by "agent" voting but by computing similarity scores between the answers and choosing the answer that is most representative. Another approach is to use multiple agents to generate a distribution over predictions, a sort of Bayesian estimation.
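A minimal sketch of the "most representative answer" idea, in case it helps make it concrete. Everything here is an assumption for illustration: `bag_of_words` is a crude stand-in for a real embedding model, and `most_representative` is a hypothetical helper, not something taken from a specific paper.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    # Crude stand-in for an embedding model: lowercase word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_representative(answers: list[str]) -> str:
    vectors = [bag_of_words(a) for a in answers]

    def total_similarity(i: int) -> float:
        # Sum of similarities between answer i and every other answer.
        return sum(
            cosine(vectors[i], vectors[j])
            for j in range(len(vectors))
            if j != i
        )

    best = max(range(len(answers)), key=total_similarity)
    return answers[best]


# Hypothetical outputs from several runs or agents.
answers = [
    "the capital of france is paris",
    "paris is the capital of france",
    "the capital is lyon",
]
print(most_representative(answers))
```

The outlier answer scores lowest because it overlaps least with the rest, which is exactly why this selects for consensus rather than novelty.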

3 comments

mritchie712 801 days ago

For my use case (generating an interesting H1), using a similarity score would defeat the purpose.

In this approach, I'm looking for the diamond in the rough. It's often dissimilar from the others. With this approach, the diamond can still get a high number of votes.


CuriouslyC 801 days ago

That approach definitely has promise. I would have agents rate answers and take the highest rated rather than vote for them, though, since picking one of n loses information about ranking and preference gradients. Also, you can do the whole process in one prompt; if you're currently re-prompting, it's cheaper to batch it up.
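To illustrate the information loss, here is a rough sketch comparing the two aggregation rules on made-up data. The judge names, candidate names, and scores are all hypothetical; in practice the ratings would come from the LLM.

```python
from collections import Counter
from statistics import mean

# Hypothetical ratings: each judge scores every candidate from 1 to 10.
ratings = {
    "judge_a": {"h1_a": 9, "h1_b": 8, "h1_c": 2},
    "judge_b": {"h1_a": 3, "h1_b": 8, "h1_c": 9},
    "judge_c": {"h1_a": 9, "h1_b": 8, "h1_c": 3},
}


def plurality_winner(ratings: dict) -> str:
    # Each judge votes only for its top-rated candidate; the rest is discarded.
    votes = Counter(max(scores, key=scores.get) for scores in ratings.values())
    return votes.most_common(1)[0][0]


def mean_rating_winner(ratings: dict) -> str:
    # Keep every score and pick the candidate with the highest average.
    candidates = next(iter(ratings.values())).keys()
    averages = {c: mean(r[c] for r in ratings.values()) for c in candidates}
    return max(averages, key=averages.get)


print("plurality:", plurality_winner(ratings))
print("mean rating:", mean_rating_winner(ratings))
```

With these numbers the vote goes to `h1_a` (two first-place votes), while the mean rating favors `h1_b`, which every judge scored highly but nobody ranked first. Which behavior you want depends on whether you are after consensus or a standout.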


infecto 801 days ago

To clarify the first part: the research suggests you can run the same prompt multiple times and use those outputs as the input for picking the answer.


mistermann 801 days ago

Any chance you could expand on both of these, even enough to assist in digging deeper into them? TIA.


CuriouslyC 801 days ago

The TLDR is you can prompt the LLM to take different perspectives than its default, then combine those. If the LLM is estimating a number, the different perspectives give you a distribution over the truth, which shows you the range of biases and the most likely true answer (given wisdom of the crowd). If the LLM is generating non-quantifiable output, you can find the "average" of the answers (using embeddings or other methods) and select that one.
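For the numeric case, a small sketch of what "a distribution over the truth" might look like once the per-perspective estimates are collected. The perspective names and values are invented for illustration; the LLM calls that would produce them are left out.

```python
from statistics import mean, median, stdev

# Hypothetical estimates of one quantity, one per prompted perspective.
estimates = {
    "optimist": 120.0,
    "pessimist": 80.0,
    "domain_expert": 100.0,
    "skeptic": 90.0,
    "generalist": 110.0,
}


def summarize(values) -> dict:
    # Summary statistics over the collected estimates.
    ordered = sorted(values)
    return {
        "mean": mean(ordered),      # crowd estimate
        "median": median(ordered),  # robust to one extreme perspective
        "stdev": stdev(ordered),    # sample spread across perspectives
        "low": ordered[0],
        "high": ordered[-1],
    }


summary = summarize(estimates.values())
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The spread between `low` and `high` is the "range of biases" part, and the mean or median is the wisdom-of-the-crowd estimate.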


mistermann 801 days ago

Ah ok, so both are implemented via one or more calls to the LLM, as opposed to a standard algorithmic approach?


CuriouslyC 801 days ago

Once you have Bayesian prior distributions (which it makes total sense for LLMs to estimate), you can apply tons of nifty statistical techniques. Only the bottom layer of the analysis stack is LLM-generated.
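As one example of such a technique (my choice for illustration, not the only option), here is a sketch of a conjugate normal-normal update. It assumes the LLM-derived prior can be approximated as a normal distribution and that the observation has known noise variance; all numbers are hypothetical.

```python
def normal_update(prior_mean: float, prior_var: float,
                  obs: float, obs_var: float) -> tuple[float, float]:
    # Conjugate update for a normal prior and a normal observation
    # with known variance: precisions (1 / variance) add.
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_mean * prior_precision + obs * obs_precision)
    return post_mean, post_var


# Hypothetical prior from LLM estimates: mean 100, variance 250.
# Hypothetical real-world measurement: 108, with noise variance 50.
post_mean, post_var = normal_update(100.0, 250.0, 108.0, 50.0)
print(round(post_mean, 2), round(post_var, 2))
```

The posterior lands between the prior and the measurement, weighted toward whichever is more precise, and none of this step involves another LLM call.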
