by seniorsassycat
246 days ago
I'm curious what effect the system prompt has:
- Randomize a and b; maybe there's a preference for answering "a", or for whichever option comes first.
- How do references to training data or roles affect the responses?

Limiting the response to a/b/pass makes sense for measuring the results, but it feels like it could also affect them. What would we see with a full response followed by a judgement pass?
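A rough sketch of the randomization idea, in case it helps: shuffle which option gets the "a" label on each trial, then tally answers both by label and by underlying content. Everything here is hypothetical (`build_trial`, `tally`, the prompt wording, and the `ask` callable standing in for whatever model call the experiment uses); it is not taken from the original setup.

```python
import random
from collections import Counter


def build_trial(option_x, option_y, rng):
    """Randomly decide which option is shown as 'a' and which as 'b'."""
    if rng.random() < 0.5:
        first, second = option_y, option_x
    else:
        first, second = option_x, option_y
    # Hypothetical prompt wording; the real experiment's prompt may differ.
    prompt = f"a) {first}\nb) {second}\nAnswer with a, b, or pass."
    return prompt, {"a": first, "b": second}


def tally(trials, ask, rng):
    """Count answers by label (a/b/pass) and by the content that label mapped to."""
    by_label = Counter()
    by_content = Counter()
    for option_x, option_y in trials:
        prompt, mapping = build_trial(option_x, option_y, rng)
        answer = ask(prompt).strip().lower()
        by_label[answer] += 1
        # Anything that is not 'a' or 'b' is counted as a pass.
        by_content[mapping.get(answer, "pass")] += 1
    return by_label, by_content


def always_a(prompt):
    """Stub 'model' with a pure label bias: it answers 'a' no matter what."""
    return "a"


if __name__ == "__main__":
    rng = random.Random(0)
    labels, contents = tally([("X", "Y")] * 200, always_a, rng)
    print(labels)
    print(contents)
```

With the stub, every answer lands on the "a" label while the content counts split between the two options, which is the signature of a label or position preference rather than a preference for the content itself. A real model would be plugged in as `ask`.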