by coder543
911 days ago
Why filter out the votes made after only one or two prompts? A lot of times, a single response is all you need to see. Do you really need more than this to know which one you’re going to pick? https://i.imgur.com/En37EJD.png Avatar doesn’t have humans? Seriously?
Your test isn't checking instruction following, consistency, or logic, just one fact, which the model you chose may have gotten right by chance. That's fine if you only expect the model to check facts and you don't plan to have a conversation, but if you want more than that, it doesn't work very well.
I'm hoping there are votes in there that reflect those qualities, and filtering by conversation length seems like the easiest way to improve the vote quality a bit.
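To make that concrete, here's a rough sketch of the kind of filter I mean. The `Vote` record, its field names, and the threshold of 3 prompts are all made up for illustration; I don't know what the real vote data looks like.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    # Hypothetical record: these field names are assumptions, not a real schema.
    winner: str        # which model the voter picked
    num_prompts: int   # how many prompts the voter sent before voting


def filter_by_length(votes, min_prompts=3):
    """Keep only votes cast after at least `min_prompts` prompts."""
    return [v for v in votes if v.num_prompts >= min_prompts]


votes = [
    Vote("model_a", 1),
    Vote("model_b", 2),
    Vote("model_a", 4),
    Vote("model_b", 7),
]

# Drops the votes made after only one or two prompts.
kept = filter_by_length(votes)
```

The cutoff is a blunt instrument: it throws away some perfectly good one-prompt votes, but the ones that remain are more likely to say something about how the model holds up across a conversation.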