by intended 781 days ago
Just to check we are on the same page:

The output is the probability that x is the correct n+1 token based on the input of n tokens.

You are stating that the output will be a probability distribution in which token n+1 has an 80% chance of being "left" and a 20% chance of being "right".

In essence, when the model evaluates the input, at some level it comprehends the semantics of the input and then does a weighted coin flip.
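To make the "weighted coin flip" reading concrete, here is a toy sketch in Python. It assumes the 80/20 split from above as a hard-coded distribution; `next_token_probs` and `sample_next_token` are made-up names for illustration, not part of any real model or library.

```python
import random

# Toy next-token distribution using the 80/20 figures from the example above.
# A real model would produce this from its logits; here it is hard-coded.
next_token_probs = {"left": 0.8, "right": 0.2}

def sample_next_token(probs, rng):
    """Draw one token in proportion to its probability (the weighted coin flip)."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_next_token(next_token_probs, rng) for _ in range(10_000)]
left_share = draws.count("left") / len(draws)
print(f"share of 'left' over {len(draws)} draws: {left_share:.3f}")
```

Under this reading, the first output token comes out as "left" in roughly 80% of runs and as "right" in the rest.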

What I am stating is that, given the input prompt, the nature of an LLM, and the training data, the output will be "Left".

The LLM will not be doing a coin flip at this stage, since its prediction is purely text-based.

The input vector constrains it to 80% left. Since its training data is human text, this essentially constrains the first output token to left 100% of the time.
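One way to picture the "left 100% of the time" outcome is greedy (argmax) decoding, where the most probable token is always taken. That is an assumption made here for illustration; nothing above says how the decoding step is actually configured, and `greedy_next_token` is a made-up helper.

```python
# Same toy 80/20 distribution; the figures come from the example above.
next_token_probs = {"left": 0.8, "right": 0.2}

def greedy_next_token(probs):
    """Always return the single most probable token (no randomness involved)."""
    return max(probs, key=probs.get)

picks = [greedy_next_token(next_token_probs) for _ in range(1_000)]
print(set(picks))  # every pick is the same token
```

With this rule the 20% option never appears as the first token, no matter how many times the prompt is run.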

If you try to have it provide tokens n+2, n+3, etc. in the same output, then it will start spitting out "right".

Are these the two positions at play here? Have I represented you correctly, and have I represented myself accurately?