by 6gvONxR4sf7o
784 days ago
Nope, because all of that is taken care of by the mechanisms for evaluating the model. Strictly speaking, the model outputs a probability distribution. The question is why that distribution doesn’t match the instructions.
1) Prompt 1: “You are a weighted random choice generator. About 80% of the time please say ‘left’ and about 20% of the time say ‘right’. Simply reply with left or right. Do not say anything else.”
2) Assume that the training data gives examples of 2.1) single coin flips and 2.2) multiple coin flips.
Consider a slightly different prompt, prompt 2:
3) Prompt 2: same as prompt 1, except it asks for 1000 lefts/rights in the same response (l,l,l,l,r,l,l,l…)
——
I think what you are describing is prompt 2. I just did a quick test with GPT-4 and got a 27-3 split when using prompt 2.
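For a rough sense of how far a 27-3 split is from the requested 80/20, here is a minimal sketch that computes the chance of seeing at least 27 lefts in 30 draws from an ideal 80% coin. It assumes the 30 draws are independent, which is not strictly true for items generated in one response (each one is conditioned on the ones before it), and the helper name `binomial_tail` is made up for illustration.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), i.e. k or more successes in n draws."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 27 lefts out of 30, measured against the requested 80% rate.
# Assumption: draws are independent, which a single LLM response does not guarantee.
p_at_least_27 = binomial_tail(n=30, k=27, p=0.8)
print(f"P(at least 27 lefts in 30) = {p_at_least_27:.3f}")
```

Under that assumption the tail probability comes out at about 0.12, so a 27-3 split from a sample of 30 is not strong evidence either way; a longer run would be needed to say whether the model is really off from 80/20.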
However, for prompt 1 you get only left. To me this makes sense, because running prompt 1 x100 should result in:
Pass 1: the LLM receives the prompt and parses it. The LLM predicts the next token. The next token should be left. Pass 2: same as pass 1.
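The difference between "the next token should be left" and "the model outputs a probability distribution" can be sketched with a toy simulation. This is only an illustration: the 0.8/0.2 numbers are a hypothetical next-token distribution, not anything measured from GPT-4, and `greedy_pick`, `sampled_pick`, and `run_passes` are made-up names. It says nothing about how any real service decodes.

```python
import random
from collections import Counter

# Hypothetical next-token distribution for prompt 1; the real values are unknown.
NEXT_TOKEN_PROBS = {"left": 0.8, "right": 0.2}

def greedy_pick(probs):
    """Always return the single most likely token (argmax decoding)."""
    return max(probs, key=probs.get)

def sampled_pick(probs, rng):
    """Draw one token in proportion to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

def run_passes(pick, n=100):
    """Simulate n independent submissions of prompt 1, each starting from scratch."""
    return Counter(pick() for _ in range(n))

rng = random.Random(0)  # fixed seed so the run is repeatable
greedy_counts = run_passes(lambda: greedy_pick(NEXT_TOKEN_PROBS))
sampled_counts = run_passes(lambda: sampled_pick(NEXT_TOKEN_PROBS, rng))

print("greedy: ", dict(greedy_counts))
print("sampled:", dict(sampled_counts))
```

With argmax picking, every pass returns left, which matches the "pass 2: same as pass 1" reasoning. With sampling, the same per-pass distribution yields a mix of left and right across passes even though no pass knows about the others.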
——
For prompt 1, every prompt submission is a tabula rasa. So it will say left, which is the correct answer for the active universe of valid prompt responses according to the model.
Unless I am reading you wrong and you are saying the model is actually acting as a weighted coin flip.
In theory, the LLM should be more responsive if you ask it to follow a 60:40 or 50:50 split for pass 1. I’ll see if I can test this later.
(Heck, now I’m more concerned about the cases where it does manage to apply the distribution.)