HN Mirror



	by akoboldfrying 160 days ago
	Certainly! What surprised me was that LLMs are apparently deliberately designed to allow multiple ways of encoding the same string as tokens. I had assumed this would lead to inefficiency, because training would not know whether to favour outputting, say, se|same or ses|ame after "open", and would thus put some weight on each. But provided there's a deterministic rule, like "always choose the longest matching token", this uncertainty goes away.
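To make the point concrete, here is a minimal sketch in Python. The vocabulary is a made-up toy one, and `all_segmentations` and `longest_match` are hypothetical helper names, not part of any real tokenizer library. Real tokenizers may well use a different deterministic rule (BPE, for instance, applies merges in a fixed order rather than strictly picking the longest match), so treat this only as an illustration of how a fixed rule selects one encoding out of many.

```python
# Toy vocabulary, invented for illustration only.
TOY_VOCAB = {"s", "e", "a", "m", "se", "ses", "same", "ame"}


def all_segmentations(text, vocab):
    """Return every way to split `text` into tokens drawn from `vocab`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        piece = text[:end]
        if piece in vocab:
            for rest in all_segmentations(text[end:], vocab):
                results.append([piece] + rest)
    return results


def longest_match(text, vocab):
    """Greedy left-to-right tokenization: always take the longest matching token."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink.
        for end in range(len(text), pos, -1):
            if text[pos:end] in vocab:
                tokens.append(text[pos:end])
                pos = end
                break
        else:
            raise ValueError(f"no token in vocab matches at position {pos}")
    return tokens


if __name__ == "__main__":
    candidates = all_segmentations("sesame", TOY_VOCAB)
    print(len(candidates), "possible encodings, including:")
    print(["se", "same"] in candidates, ["ses", "ame"] in candidates)
    print("rule picks:", longest_match("sesame", TOY_VOCAB))
```

Both se|same and ses|ame are valid encodings under this vocabulary, but the greedy rule always returns the same one, so training data tokenized this way would only ever contain that split.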

1 comments

bonzini 159 days ago

LLMs are probabilistic black boxes; trying to inject determinism into their natural language processing (as opposed to, e.g., forcing a grammar on the output) may very well screw them over completely.


akoboldfrying 159 days ago

LLMs are ultimately just matrix multiplication and some other maths; nothing about them is inherently nondeterministic. When nondeterminism is present, it's because it was deliberately sprinkled on top (because it tends to produce better results).
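A rough sketch of what "sprinkled on top" can look like, assuming the model's final step yields a list of scores (logits) over candidate tokens. The logit values and the names `greedy_pick` and `sample_pick` are made up for illustration and do not come from any particular library. Picking the highest score is fully deterministic; randomness only enters when we choose to sample from the resulting distribution.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Deterministic: always return the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_pick(logits, temperature, rng):
    """Nondeterministic by design: draw an index according to the softmax probabilities."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.5, 0.1]  # hypothetical scores for three candidate tokens
    print("greedy:", [greedy_pick(logits) for _ in range(5)])
    rng = random.Random()  # unseeded, so results vary between runs
    print("sampled:", [sample_pick(logits, 1.0, rng) for _ in range(5)])
```

Even the sampled path becomes repeatable if the random generator is seeded, which is another way of saying the randomness is an added ingredient rather than something inherent in the maths.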


bonzini 158 days ago

Yes, determinism is not the best word. What I mean is that if you force the LLM to output "carr+o" even when it prefers "carro", this could result in worse quality output.
