|
|
|
|
|
by turnsout
393 days ago
|
|
Yes, this was a great article. We need more of this independent research into LLM quirks & biases. It's all too easy to whip up an eval suite that looks good on the surface, without realizing that something as simple as list order can swing the results wildly. |
|