by hansvm
529 days ago
Mostly unrelated (I agree with you, and I wrote one of the ancestor comments you're responding to, with the same line of thinking): I have built a couple of LLMs where the distribution itself is stochastic. That's not key to how they work as a black box, but much like how quicksort has certain performance characteristics, I did find it advantageous to introduce randomness into the model itself. You could still easily model the next token as a conditional probability distribution if you wanted, though; the computation of entropy just might be a bit spendier.
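To make the "spendier entropy" point concrete, here is a rough sketch, not how any particular model does it. It assumes the internal randomness is Gaussian noise added to the logits, which is purely an illustrative choice; the names `stochastic_next_token_dist`, `marginal_dist`, and `noise_scale` are made up for this example. The idea is just that the conditional distribution still exists as the average over the model's internal randomness, so estimating it (and its entropy) takes many forward passes instead of one.

```python
import math
import random


def stochastic_next_token_dist(logits, noise_scale, rng):
    """One stochastic 'forward pass' (hypothetical): perturb the logits
    with Gaussian noise, then apply a numerically stable softmax."""
    noisy = [x + rng.gauss(0.0, noise_scale) for x in logits]
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_dist(logits, noise_scale, n_samples, rng):
    """Monte Carlo estimate of p(token | context): average the sampled
    distributions over the model's internal randomness."""
    acc = [0.0] * len(logits)
    for _ in range(n_samples):
        dist = stochastic_next_token_dist(logits, noise_scale, rng)
        for i, p in enumerate(dist):
            acc[i] += p
    return [a / n_samples for a in acc]


def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


rng = random.Random(0)          # fixed seed so the sketch is repeatable
logits = [2.0, 1.0, 0.1]        # toy three-token vocabulary
p = marginal_dist(logits, noise_scale=0.5, n_samples=1000, rng=rng)
h = entropy(p)
print(p, h)
```

With a deterministic model the entropy costs one softmax; here it costs `n_samples` of them, and the estimate is only as good as the sample count.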