Hacker News
by extraduder_ire 403 days ago
Are there any stable-output large language models? Like what Stable Diffusion does for image diffusion models.

If you use a deterministic sampling strategy for the next token (e.g., always output the token with the highest probability) then a traditional LLM should be deterministic on the same hardware/software stack.
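To make that concrete, here is a minimal sketch of greedy decoding. The `toy_logits` function is a made-up stand-in for a real model's forward pass (not any library's API). The only point is that with argmax selection there is no randomness anywhere in the loop.

```python
def argmax(values):
    """Index of the largest value; ties go to the lowest index."""
    return max(range(len(values)), key=values.__getitem__)


def toy_logits(tokens, vocab_size=4):
    """Hypothetical stand-in for a model: a fixed function of the context."""
    last = tokens[-1]
    return [float((last * 3 + i * 5 + len(tokens)) % 7) for i in range(vocab_size)]


def greedy_decode(prompt, steps):
    """Always append the highest-scoring token. No RNG is involved."""
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(argmax(toy_logits(tokens)))
    return tokens


first = greedy_decode([0], 8)
second = greedy_decode([0], 8)
print(first == second)  # same prompt, same stack, same tokens
```

Running it twice on the same machine gives identical token lists, which is all "deterministic" means here.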
Wouldn't seeding the RNG used to pick the next token be more configurable? How would changing the hardware/other software make a difference to what comes out of the model?
> Wouldn't seeding the RNG used to pick the next token be more configurable?

Sure, that would work.
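A rough sketch of the seeded variant, assuming a toy model again (`toy_logits` and `sample_decode` are hypothetical names, not a real library's interface). Sampling stays random in distribution, but a fixed seed makes each run repeatable.

```python
import math
import random


def toy_logits(tokens, vocab_size=4):
    """Hypothetical stand-in for a model: a fixed function of the context."""
    last = tokens[-1]
    return [float((last * 3 + i * 5 + len(tokens)) % 7) for i in range(vocab_size)]


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, shifting by the max for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_decode(prompt, steps, seed, temperature=1.0):
    """Sample each next token from the softmax using a private, seeded RNG."""
    rng = random.Random(seed)  # private generator, independent of global state
    tokens = list(prompt)
    for _ in range(steps):
        probs = softmax(toy_logits(tokens), temperature)
        tokens.append(rng.choices(range(len(probs)), weights=probs)[0])
    return tokens


print(sample_decode([0], 8, seed=42) == sample_decode([0], 8, seed=42))
```

The seed and temperature become the knobs: same seed reproduces a run, a different seed gives another draw from the same distribution.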

> How would changing the hardware/other software make a difference to what comes out of the model?

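Floating-point arithmetic is not entirely consistent between different GPUs/TPUs/operating systems: addition is not associative, so kernels that sum in a different order can produce slightly different logits.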
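A small illustration of the floating-point point, in plain Python rather than on a GPU. It is a contrived case, not a claim about any particular hardware: the same three numbers summed in two orders differ in the last bit, and that is enough to flip an argmax when another logit is nearly tied.

```python
def argmax(values):
    """Index of the largest value; ties go to the lowest index."""
    return max(range(len(values)), key=values.__getitem__)


# Same three terms, two summation orders (think: two different reduction kernels).
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)  # False: float addition is not associative

# A rival token whose logit is almost identical (hypothetical value).
rival = 0.6

pick_a = argmax([rival, left_to_right])  # the summed logit wins by one ulp
pick_b = argmax([rival, right_to_left])  # exact tie, so the rival wins

print(pick_a, pick_b)
```

Once one token differs, everything generated after it can diverge, which is why even greedy decoding is only reproducible on a fixed hardware/software stack.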

Deterministic is one thing, but stable to small perturbations in the input is another.
> Deterministic is one thing, but stable to small perturbations in the input is another.

Yes, and the one thing that was asked about was "deterministic", not "stable to small perturbations in the input".

This looks "fun" too: commit fixing a small typo -> the app broke.
So nothing's changed, then :D