
by seniorsassycat 246 days ago

I'm curious what effect the system prompt has.

- Randomize a and b; maybe there's a preference for answering a, or for whichever option comes first.
- How do references to training data or roles affect the responses?
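For the randomization idea, here is a rough Python sketch of what I mean, not anything taken from the original experiment. The `ask` callback is hypothetical: it stands in for whatever sends the prompt to the model and returns 'a', 'b', or 'pass'. The names `present` and `tally` are made up for illustration too.

```python
import random
from collections import Counter

def present(option_x, option_y, rng):
    """Randomly assign the two options to the labels 'a' and 'b'."""
    pair = [option_x, option_y]
    rng.shuffle(pair)
    return {"a": pair[0], "b": pair[1]}

def tally(trials, ask, seed=0):
    """Count answers by label and by underlying option.

    `ask` is a hypothetical callback (an assumption, not a real API):
    it takes the labelled dict and returns 'a', 'b', or 'pass'.
    """
    rng = random.Random(seed)
    by_label = Counter()   # how often each label was chosen
    by_option = Counter()  # how often each underlying option was chosen
    for option_x, option_y in trials:
        labelled = present(option_x, option_y, rng)
        answer = ask(labelled)
        by_label[answer] += 1
        if answer in labelled:  # 'pass' maps to no option
            by_option[labelled[answer]] += 1
    return by_label, by_option
```

If `by_label` leans heavily toward 'a' while `by_option` stays roughly even, that points to a position or label preference rather than a preference for the content.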

Limiting the response to a/b/pass makes sense for measuring the results, but it feels like it could affect them. What would we see with a full response followed by a judgement pass?