by janalsncm 597 days ago
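It depends. If we use beam search, we look for the most likely sequence of tokens as a whole (approximately, since only a fixed number of candidates is kept at each step) rather than picking the most likely token at each point in time. This process is still deterministic, though.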
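
To make the difference concrete, here's a rough sketch. The `TOY_MODEL` table and the `beam_search` helper are made up for illustration (a real model computes these probabilities, it doesn't look them up), and a beam width of 1 is just greedy decoding.

```python
import math

# Hypothetical toy "model": next-token probabilities given the prefix so far.
TOY_MODEL = {
    (): {"A": 0.5, "B": 0.4, "C": 0.1},
    ("A",): {"A": 0.4, "B": 0.3, "C": 0.3},
    ("B",): {"A": 0.9, "B": 0.05, "C": 0.05},
    ("C",): {"A": 0.4, "B": 0.3, "C": 0.3},
}


def beam_search(model, length, beam_width):
    """Keep the `beam_width` best prefixes at each step, scored by log-probability."""
    beams = [((), 0.0)]  # (prefix, log-probability of that prefix)
    for _ in range(length):
        candidates = [
            (prefix + (token,), logp + math.log(p))
            for prefix, logp in beams
            for token, p in model[prefix].items()
        ]
        # Highest score first; ties are broken by the sequence itself,
        # so the result never depends on anything random.
        candidates.sort(key=lambda c: (-c[1], c[0]))
        beams = candidates[:beam_width]
    return beams[0]


# Beam width 1 is greedy decoding: always take the single most likely next token.
greedy_seq, greedy_logp = beam_search(TOY_MODEL, length=2, beam_width=1)
# Beam width 2 keeps a second candidate alive and can find a better sequence.
beam_seq, beam_logp = beam_search(TOY_MODEL, length=2, beam_width=2)

print(greedy_seq, round(math.exp(greedy_logp), 2))
print(beam_seq, round(math.exp(beam_logp), 2))
```

Greedy commits to "A" first because it has the highest probability at step one, while the wider beam keeps "B" around and ends up with a more probable two-token sequence. Run it as many times as you like and the result doesn't change.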

We can also sample from the distribution, which introduces randomness. Basically, if word1 should be chosen 75% of the time and word2 25% of the time, it will do that.
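
A minimal sketch of that, using the 75/25 split from above. The `probs` dict and the `sample_token` helper are hypothetical names for illustration; a real model would produce a distribution over its whole vocabulary.

```python
import random
from collections import Counter

# Hypothetical next-token distribution matching the 75% / 25% example.
probs = {"word1": 0.75, "word2": 0.25}


def sample_token(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
n_draws = 10_000
counts = Counter(sample_token(probs, rng) for _ in range(n_draws))
freq_word1 = counts["word1"] / n_draws

print(counts)
print(freq_word1)
```

Any single draw can go either way, but over many draws word1 comes up roughly three times out of four. Fixing the seed makes the draws repeatable, which is why sampling with a pinned seed can look deterministic.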

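The randomness you’re seeing can also be due to implementation details rather than the decoding strategy, so you may still get different outputs even when sampling is turned off.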
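
One example of such a detail, and this is my own illustration rather than something taken from the linked thread: floating-point addition isn't associative, so adding the same numbers in a different order can give a slightly different result. Parallel hardware doesn't always add things up in the same order from run to run.

```python
# The same three numbers, added in two different orders.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # add a and b first
right = a + (b + c)  # add b and c first

print(left)
print(right)
print(left == right)
```

The difference is tiny, but if two candidate tokens have nearly identical scores, a difference like this could plausibly be enough to change which one comes out on top, and every token after that would then differ too.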

https://community.openai.com/t/a-question-on-determinism/818...