
Interesting. I'll check out the paper.

It's astoundingly less efficient, right? How many compares (and LLM calls) does it take to rank 10 items in order? And is it actually stable? You could get a ranking from logprobs in a single LLM call for 10 items, or run it n=3 times with a shuffled order each time and average the results. I'm not sure how that scales to larger sets of items, though.

I guess it depends on how many items you are sorting, but when I think about sorting I think about putting 100+ items in order.
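
To put rough numbers on the question, here is a minimal sketch in Python. It assumes every pairwise compare costs one LLM call; the helper names and the `rank_once` callback (a stand-in for a single listwise LLM call that returns the items in ranked order) are hypothetical and are not taken from the paper.

```python
import math
import random
from functools import cmp_to_key


def comparison_lower_bound(n):
    """Worst-case minimum compares for any comparison sort: ceil(log2(n!))."""
    return math.ceil(math.log2(math.factorial(n)))


def all_pairs(n):
    """Compares needed if every pair is judged once (round-robin)."""
    return n * (n - 1) // 2


def count_sort_comparisons(items, compare):
    """Sort with a pairwise comparator and count how often it is called.

    `compare` stands in for one pairwise LLM judgment (hypothetical).
    """
    calls = 0

    def counted(a, b):
        nonlocal calls
        calls += 1
        return compare(a, b)

    ordered = sorted(items, key=cmp_to_key(counted))
    return ordered, calls


def averaged_ranking(items, rank_once, passes=3, seed=0):
    """Rank shuffled copies `passes` times and order items by summed position.

    `rank_once` stands in for one listwise LLM call (hypothetical).
    Items must be hashable and unique.
    """
    rng = random.Random(seed)
    totals = {item: 0 for item in items}
    for _ in range(passes):
        shuffled = list(items)
        rng.shuffle(shuffled)
        for position, item in enumerate(rank_once(shuffled)):
            totals[item] += position
    # sorted() is stable, so ties keep the original input order.
    return sorted(items, key=lambda item: totals[item])


if __name__ == "__main__":
    for n in (10, 100):
        print(n, comparison_lower_bound(n), all_pairs(n))
```

Under those assumptions, 10 items need at least 22 compares in the worst case and 45 if every pair is judged, against 3 calls for the shuffle-and-average approach. At 100 items it is at least 525 compares, or 4950 for all pairs, and whether a single listwise call still works at that size is the open question.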