by npip99
330 days ago
Yes, our pairwise method is based entirely on 2AFC comparisons, for both intra-query and inter-query Elo calculations. It's definitely the best, if not the only, way to get extremely high signal and a score assignment that actually converges the more you sample. As for the "F" in 2AFC, we actually have this amusing snippet from our prompt:

> Do NOT output a score of 0.0, ensure to focus on which document is superior, and provide a negative or positive float between -1.0 and 1.0.
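To make the forced-choice idea concrete, here's a rough sketch of how a float like that could feed an Elo update. This is not our actual pipeline: the function names, the sign convention (positive means document A is better), the linear mapping from the float to a soft outcome, and the K-factor of 32 are all assumptions for illustration. The only Elo-standard part is the logistic expected-score formula.

```python
def parse_preference(raw: float) -> float:
    """Map a forced-choice float in [-1, 1] to an outcome for document A in [0, 1].

    Assumption: positive means A is superior, negative means B is superior.
    A value of exactly 0.0 is rejected, mirroring the "forced" in 2AFC.
    """
    if not -1.0 <= raw <= 1.0:
        raise ValueError(f"preference {raw} is outside [-1.0, 1.0]")
    if raw == 0.0:
        raise ValueError("forced choice: 0.0 is not an allowed answer")
    # Hypothetical linear mapping: -1.0 -> 0.0 (B wins), 1.0 -> 1.0 (A wins).
    return (raw + 1.0) / 2.0


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, outcome_a: float, k: float = 32.0):
    """Return updated (r_a, r_b) after one comparison; k is an assumed K-factor."""
    delta = k * (outcome_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


if __name__ == "__main__":
    # One judged pair: the judge strongly prefers document A.
    outcome = parse_preference(0.8)
    print(elo_update(1500.0, 1500.0, outcome))
```

Because the update is zero-sum, the ratings only move relative to each other, and repeated sampling over many pairs is what lets them settle.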