by esafak 337 days ago
I would have titled it "Improving ranking..."

I like that it works with `sentence_transformers`


We could change the title to "Improving search ranking with chess Elo scores". Anybody object?

Edit: ok, done. Submitted title was "Show HN: Improving RAG with chess Elo scores".

They don't use Elo scores. See my comment above, the loss function is adopted from Bradley-Terry.
Bradley-Terry and Elo scores are equivalent mathematical models! The fundamental presumption is the same Thurstone model - that an individual's skill in a particular game is a normally distributed random variable around their fundamental skill.

We did experiment with a Bradley-Terry loss function (https://hackmd.io/eOwlF7O_Q1K4hj7WZcYFiw), but we found it worked even better to calculate Elo scores, do cross-query bias adjustment, and then use an MSE loss to predict the Elo score itself.

-> Bradley-Terry and Elo scores are equivalent mathematical models!

No, they are not equivalent mathematical models. They are equivalent only in how the (logistic) score function is calculated, given matching scale factors. With Bradley-Terry as 1/(1 + e^(x(r_B - r_A))) and Elo rating as 1/(1 + 10^((r_B - r_A)/y)), equivalence requires x = ln(10)/y. More importantly, Elo rating is an online scoring system, meaning it takes into account the sequence of events. From your blog post, I understand that you are not updating the scores after each event. In other words, Elo rating can be interpreted as an incremental fitting of a Bradley-Terry model (using a similar logistic), but it is not the same!
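To make both points concrete, here is a minimal Python sketch. The function names, the K-factor of 32, the starting rating of 1500, and the toy game list are all illustrative assumptions, not anything taken from the blog post. It checks that the two logistic forms agree when x = ln(10)/y, and that sequential Elo updates give different ratings when the same games are replayed in a different order.

```python
import math

def bt_win_prob(r_a, r_b, x):
    """Bradley-Terry style logistic: 1 / (1 + e^(x (r_B - r_A)))."""
    return 1.0 / (1.0 + math.exp(x * (r_b - r_a)))

def elo_win_prob(r_a, r_b, y=400.0):
    """Elo expected score: 1 / (1 + 10^((r_B - r_A) / y))."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / y))

def elo_online(games, k=32.0, y=400.0, start=1500.0):
    """Sequential Elo updates. games is a list of (winner, loser) pairs.

    k and start are assumed, conventional values (hypothetical here).
    """
    ratings = {}
    for winner, loser in games:
        r_w = ratings.get(winner, start)
        r_l = ratings.get(loser, start)
        # The winner gains k * (actual - expected), the loser gives up the same.
        delta = k * (1.0 - elo_win_prob(r_w, r_l, y))
        ratings[winner] = r_w + delta
        ratings[loser] = r_l - delta
    return ratings

# Same probability from both forms once the scale factors are matched.
y = 400.0
x = math.log(10.0) / y
p_bt = bt_win_prob(1600.0, 1500.0, x)
p_elo = elo_win_prob(1600.0, 1500.0, y)

# Same set of games, two different orders.
games = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "b")]
forward = elo_online(games)
backward = elo_online(list(reversed(games)))

print(p_bt, p_elo)
print(forward["a"], backward["a"])
```

The two probabilities match up to floating-point rounding, while the final rating for "a" depends on the order in which the games were processed.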

-> The fundamental presumption is the same Thurstone model

The Thurstone model is similar, and as you said, it assumes a normal (as opposed to logistic) distribution, using a probit link function. It predates both models, and because of computational constraints, you can call Bradley-Terry and Elo rating computationally convenient approximations of the Thurstone model.

-> We did experiment with a Bradley-Terry loss function (https://hackmd.io/eOwlF7O_Q1K4hj7WZcYFiw)

The math is correct. Thanks for sharing. Indeed, if you do it with incremental updating, you will lose differentiability, given that the next winning probability depends on the previous updates. Call it what you want, but note that this is not truly an Elo rating, and calling it one leads to misunderstanding. It is Bradley-Terry, given that you do batch updates, with extra steps taken to connect it to an Elo score, as shown in the link.
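For contrast with sequential updating, a batch Bradley-Terry fit could look roughly like the sketch below. This is plain gradient ascent on the log-likelihood, not the loss from the linked note. The function name, learning rate, step count, and toy games are assumptions made for illustration.

```python
import math
import random

def fit_bradley_terry(games, steps=500, lr=0.1):
    """Batch fit of Bradley-Terry ratings by gradient ascent.

    games is a list of (winner, loser) pairs. steps and lr are
    hypothetical settings chosen for this toy example.
    """
    players = sorted({p for game in games for p in game})
    r = {p: 0.0 for p in players}
    for _ in range(steps):
        grad = {p: 0.0 for p in players}
        for winner, loser in games:
            p_win = 1.0 / (1.0 + math.exp(r[loser] - r[winner]))
            # Derivative of log(p_win) with respect to each rating.
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        for p in players:
            r[p] += lr * grad[p]
        # Ratings are only identified up to a shift, so center them.
        mean = sum(r.values()) / len(r)
        for p in players:
            r[p] -= mean
    return r

games = (
    [("a", "b")] * 3 + [("b", "a")]
    + [("b", "c")] * 2 + [("c", "b")]
    + [("a", "c")] * 2 + [("c", "a")]
)
ratings = fit_bradley_terry(games)

# Shuffling the games should not change the fit beyond rounding error.
shuffled = games[:]
random.Random(0).shuffle(shuffled)
ratings_shuffled = fit_bradley_terry(shuffled)

print(ratings)
```

Every step uses all games at once, so the order of the games only affects the result through floating-point rounding. That is the practical difference from an online Elo update.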

Lastly, normal and logistic distributions will lead to log(0) in evaluations, which results in an inf loss. As I can see from your comment above, you try adding uniform(0.02) as an ad hoc fix. A more elegant fix is to use a heavy-tailed distribution such as Cauchy.
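A small Python sketch of what I mean, assuming the log(0) comes from the tail probability underflowing to zero in floating point. The helper names and the deliberately extreme score gap of -800 are illustrative choices.

```python
import math

def normal_cdf(z):
    """Standard normal CDF (probit link)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def logistic_cdf(z):
    """Logistic CDF, written so that exp() never overflows."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cauchy_cdf(z):
    """Standard Cauchy CDF: polynomial tails, so it decays far more slowly."""
    return 0.5 + math.atan(z) / math.pi

def nll(p):
    """Negative log-likelihood of an observed outcome with probability p."""
    return math.inf if p == 0.0 else -math.log(p)

# An upset: the observed winner was rated far below the loser.
z = -800.0  # assumed, deliberately extreme scaled rating gap
losses = {
    "normal": nll(normal_cdf(z)),
    "logistic": nll(logistic_cdf(z)),
    "cauchy": nll(cauchy_cdf(z)),
}
print(losses)
```

With this gap, the normal and logistic probabilities underflow to exactly 0.0 and the loss is inf, while the Cauchy tail still gives a small positive probability and a finite loss.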

Yes, we found it hard to find a good title for this. Thanks for the feedback.