HN Mirror


	by tibbar 403 days ago
	If you use a deterministic sampling strategy for the next token (e.g., always output the token with the highest probability) then a traditional LLM should be deterministic on the same hardware/software stack.
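A minimal sketch of what "always output the token with the highest probability" looks like, assuming the model's scores arrive as a plain list of floats. The `logits` values and the `greedy_next_token` helper are hypothetical, made up for illustration; no real model is involved.

```python
def greedy_next_token(logits):
    """Return the index of the highest-scoring token (the first index wins ties)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical scores for a four-token vocabulary.
logits = [1.2, 3.4, 0.5, 3.1]

# No randomness is involved, so repeated calls on the same scores agree.
picks = [greedy_next_token(logits) for _ in range(5)]
print(picks)
```

The selection step is deterministic as long as the scores are bit-for-bit identical on every run, which is where the "same hardware/software stack" caveat comes in.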

2 comments

extraduder_ire 402 days ago

Wouldn't seeding the RNG used to pick the next token be more configurable? How would changing the hardware/other software make a difference to what comes out of the model?

tibbar 402 days ago

> Wouldn't seeding the RNG used to pick the next token be more configurable?

Sure, that would work.
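A rough sketch of the seeded-RNG approach, using only Python's standard library. The `logits` values, the `sample_next_token` helper, and the `generate` loop are assumptions for illustration and do not reflect any particular inference library.

```python
import math
import random


def sample_next_token(logits, rng, temperature=1.0):
    """Sample a token index from softmax(logits / temperature) using the given RNG."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max before exponentiating for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(seed, steps=10):
    """Draw `steps` tokens from fixed, hypothetical logits with a seeded RNG."""
    rng = random.Random(seed)
    logits = [1.2, 3.4, 0.5, 3.1]
    return [sample_next_token(logits, rng) for _ in range(steps)]


# The same seed reproduces the same sequence of sampled tokens.
print(generate(42) == generate(42))
```

This keeps sampling stochastic in distribution while making a given run repeatable, again assuming the logits themselves come out identical each time.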

> How would changing the hardware/other software make a difference to what comes out of the model?

Floating-point arithmetic is not entirely consistent between different GPUs/TPUs/operating systems.
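One way this can show up, sketched in plain Python: floating-point addition is not associative, so summing the same terms in a different order (as different hardware or kernels may do) can change the last bit of a score. The numbers and the `argmax` helper below are hypothetical, chosen so that the tiny difference flips a greedy pick.

```python
def argmax(xs):
    """Index of the largest value (the first index wins ties)."""
    return max(range(len(xs)), key=lambda i: xs[i])


terms = [0.1, 0.2, 0.3]  # contributions to one token's score
rival = 0.6              # hypothetical score of a competing token

# Same terms, two summation orders.
score_ltr = (terms[0] + terms[1]) + terms[2]
score_rtl = terms[0] + (terms[1] + terms[2])

pick_ltr = argmax([rival, score_ltr])
pick_rtl = argmax([rival, score_rtl])

print(score_ltr == score_rtl)  # the two sums differ in the last bit
print(pick_ltr, pick_rtl)      # the greedy choice differs as a result
```

Near-ties this close are rare for any single token, but over a long generation one flipped token changes everything that follows.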

roywiggins 403 days ago

Deterministic is one thing, but stable to small perturbations in the input is another.

dragonwriter 403 days ago

> Deterministic is one thing, but stable to small perturbations in the input is another.

Yes, and the one thing that was asked about was "deterministic", not "stable to small perturbations in the input".

kokada 403 days ago

This looks "fun" too: commit fixing a small typo -> the app broke.

lvncelot 402 days ago

So nothing's changed, then :D
