| HN Mirror

	by WithinReason 69 days ago
	The model generates probabilities for the next token; you then set the probability of disallowed tokens to 0 before sampling (deterministically or probabilistically).
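A minimal sketch of the masking step described above, assuming the model's output is available as a plain token-to-probability mapping. The function name `mask_and_sample`, the `banned` set, and the toy distribution are hypothetical and only illustrate the idea; they are not taken from any particular library.

```python
import random

def mask_and_sample(probs, banned, greedy=False, rng=random):
    """Zero out banned tokens, renormalize, then pick one token.

    probs  -- dict mapping token -> probability (hypothetical model output)
    banned -- set of tokens that must never be emitted
    greedy -- True for deterministic argmax, False for random sampling
    """
    # Dropping a token from the distribution is the same as setting its probability to 0.
    allowed = {tok: p for tok, p in probs.items() if tok not in banned}
    total = sum(allowed.values())
    if total <= 0:
        raise ValueError("every token with nonzero probability is banned")
    # Renormalize so the remaining probabilities sum to 1 again.
    renorm = {tok: p / total for tok, p in allowed.items()}
    if greedy:
        return max(renorm, key=renorm.get)
    tokens = list(renorm)
    weights = [renorm[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Toy distribution: "cat" is the most likely token, but it is banned.
probs = {"cat": 0.5, "dog": 0.3, "bird": 0.2}
print(mask_and_sample(probs, banned={"cat"}, greedy=True))
```

Whichever way the final pick is made, a banned token can never be chosen, because it is no longer in the distribution being sampled.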

2 comments

vrighter 65 days ago

But some tokens are disallowed only in certain contexts, not in others.

You might be talking about how to defuse a bomb, instead of building one. Or you might be talking about a bomb in a video game. Or you could be talking about someone being "da bomb!". Or maybe the history of certain types of bombs. Or a ton of other possible contexts. You can't just block the "bomb" token, or the word "explosive" when followed by "device", or "rapid unscheduled disassembly contraption". You just can't predict all of the infinitely many wrong possibilities.

And there is no way to figure out which contexts the word is safe in.

WithinReason 65 days ago

I'm responding to:

> Fundamentally there's no way to deterministically guarantee anything about the output.

with the fact that you can, e.g., force a network to output syntactically correct code, as long as you can syntax-check each token.
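A rough sketch of what syntax-checking each token during generation could look like, using balanced parentheses as a stand-in for "syntactically correct code". Everything here is a toy assumption: `VOCAB`, `toy_model` (random scores in place of a real network), `is_valid_next`, and the `soft_limit` parameter are hypothetical names. The point is only that the set of allowed tokens is recomputed from the prefix at every step, so the mask depends on context.

```python
import random

VOCAB = ["(", ")", "x", "<eos>"]

def is_valid_next(prefix, token):
    """Incremental syntax check: may `token` follow `prefix` without breaking balance?"""
    depth = prefix.count("(") - prefix.count(")")
    if token == ")":
        return depth > 0       # never close a bracket that was not opened
    if token == "<eos>":
        return depth == 0      # never stop while brackets are still open
    return True

def toy_model(prefix, rng):
    """Stand-in for an LLM: arbitrary positive scores for every token."""
    return {tok: rng.random() + 1e-9 for tok in VOCAB}

def generate(soft_limit=12, seed=0):
    rng = random.Random(seed)
    out = []
    while True:
        # The mask is rebuilt from the current prefix before every sampling step.
        allowed = [t for t in VOCAB if is_valid_next(out, t)]
        if len(out) >= soft_limit:
            # Past the soft limit, only allow tokens that wind the output down.
            allowed = [t for t in allowed if t in (")", "<eos>")]
        scores = toy_model(out, rng)
        weights = [scores[t] for t in allowed]
        token = rng.choices(allowed, weights=weights, k=1)[0]
        if token == "<eos>":
            return "".join(out)
        out.append(token)

print(generate(seed=1))
```

In this sketch the check runs before a token is chosen, not on finished output: an invalid token has zero chance of being sampled, so every string returned by `generate` is balanced by construction.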

vrighter 65 days ago

You just said an oxymoron right there.

If you're syntax checking every token, you're doing it AFTER the LLM has spat out its output. You didn't actually do anything to force the LLM to produce correct code. You just reject invalid output after the fact.

If you could force it to emit syntactically correct code, you wouldn't need to perform a separate manual syntax check afterwards.

WithinReason 65 days ago

No, you prevent the LLM from generating invalid tokens. That means you "force it to emit syntactically correct code".

vrighter 64 days ago

How do you disallow it from generating specific things? My point is that you can't. And again, how do you stop it from generating certain tokens, but only in certain contexts?

WithinReason 64 days ago

E.g. you ask it what's 2+2, and only allow it to generate digits in the response. Set the probabilities of all other tokens to 0, then sample from the rest. This is trivial.
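The digits-only case can be sketched on raw scores: set the logit of every non-digit token to negative infinity, which gives it probability exactly 0 after the softmax. The logits below are made-up numbers for a hypothetical five-token vocabulary, and `constrained_softmax` is an illustrative name, not a real library function.

```python
import math

def constrained_softmax(logits, allowed):
    """Softmax over logits after masking every token outside `allowed` to -inf."""
    masked = {t: (v if t in allowed else float("-inf")) for t, v in logits.items()}
    peak = max(masked.values())
    if peak == float("-inf"):
        raise ValueError("no allowed token in the vocabulary")
    # exp(-inf) is 0.0, so masked tokens end up with probability exactly 0.
    exps = {t: math.exp(v - peak) for t, v in masked.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

# Made-up next-token logits after the prompt "What's 2+2?" (assumption).
logits = {"The": 3.0, "four": 2.5, "4": 2.0, "5": 0.5, " ": 1.0}
digits = set("0123456789")

probs = constrained_softmax(logits, digits)
answer = max(probs, key=probs.get)  # deterministic (greedy) pick
print(answer, probs)
```

Note that the mask only guarantees the shape of the output (a digit). Which digit comes out still depends on the scores the model assigned.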

PunchyHamster 69 days ago

But filtering a particular token doesn't fix it even slightly, because it's a language model and it will understand synonyms or references.

WithinReason 69 days ago

I'm obviously talking about network output, not input.

zbentley 64 days ago

Good-token/bad-token overlap is near 100%. For example, try interacting with quantitative data, or program code, without using these tokens:

> :(){ :|: & };:

Now try running that in your shell.

PunchyHamster 68 days ago

Which you can affect by just telling it to use different wording... or a different language, for that matter.
