by tracerbulletx
849 days ago
This doesn't change the point of your answer, but to add on: the output of that learned function is a probability distribution over all possible next tokens, and at inference time the next token is sampled from that distribution. The sampling method used can vary at inference time.
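To make the sampling step concrete, here is a minimal sketch in plain Python. It is not how any particular model or library does it; the tiny vocabulary, the logit values, and the helper names (`softmax`, `sample_greedy`, `sample_top_k`) are all made up for illustration. It only shows that the same probabilities can be turned into a token in different ways.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_greedy(probs):
    """Always pick the single most likely token index."""
    return max(range(len(probs)), key=probs.__getitem__)

def sample_top_k(probs, k, rng):
    """Draw randomly among the k most likely token indices,
    weighted by their probabilities."""
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    weights = [probs[i] for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

# Hypothetical 4-token vocabulary and made-up model scores (logits).
vocab = ["cat", "dog", "car", "tree"]
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)
rng = random.Random(0)  # seeded so repeated runs draw the same tokens

print(vocab[sample_greedy(probs)])         # greedy: highest-probability token
print(vocab[sample_top_k(probs, 2, rng)])  # top-k: one of the two likeliest
```

Greedy decoding is deterministic, while top-k (and temperature) sampling can give a different token on each call, which is one reason the same prompt can produce different completions.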
I thought the LLM was "getting to know the user" but that it had a short memory span (the context) and thus "forgot" already-calculated weights that it would use to (re)generate new weights.
Further down I learned that it freaking forgets all the previous weights in general (I think that's what I learned; I'm getting there).