by JTon 955 days ago
> because all the rules have been pushed out.

Can you unpack this a little please? Is it possible to ELI5 the mechanisms involved that can "push" a rule set out? I would have assumed the rules apply globally/uniformly across the entire prompt


> ELI5

The model can look at X amount of input to decide what words come next.

Normally, Google fills part of X with instructions, and you control the other part.

However if you give it exactly X amount of input, then there's no room for Google's original instructions, and you control it all.
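To make that concrete, here is a toy sketch. It assumes the instructions are simply placed in front of the user input and that anything beyond the last X tokens is dropped from the front. That is my simplification for illustration, not a description of how Google's system actually works, and `WINDOW` and `visible_context` are made-up names.

```python
# Toy model: the "model" only sees the last WINDOW tokens of what it is given.
WINDOW = 8  # hypothetical context size in tokens (the "X" above)

def visible_context(system_tokens, user_tokens, window=WINDOW):
    """Concatenate instructions and user input, keep only the tail that fits."""
    full = system_tokens + user_tokens
    return full[-window:]

system = ["SYS1", "SYS2", "SYS3"]   # stand-in for the owner's instructions
short_user = ["u"] * 4              # leaves room for the instructions
long_user = ["u"] * 8               # exactly X tokens of user input

print(visible_context(system, short_user))  # instructions still visible
print(visible_context(system, long_user))   # only user tokens remain
```

With the short input all three `SYS` tokens survive; with exactly `WINDOW` user tokens none of them do.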

Thanks! So is patching this as simple as not allowing the entire space of X for user prompt? i.e. guaranteeing some amount of X for model owner's instructions

No. The input and the output are the same thing with transformers. Internally, you're providing them with some sequence of tokens and asking them to continue the sequence. If the sequence they generate exceeds their capacity, they can "forget" what they were doing.
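A rough sketch of what "forgetting" means here, assuming a plain sliding window over a single token sequence. `next_token` is a fake stand-in for a model, not a real API: it follows the rule only while the rule is still inside the window.

```python
WINDOW = 6  # hypothetical capacity in tokens

def next_token(context):
    # Fake model: obeys the rule only if it can still see it.
    return "polite" if "RULE" in context else "anything"

def generate(prompt, steps, window=WINDOW):
    seq = list(prompt)   # prompt and output live in the same sequence
    out = []
    for _ in range(steps):
        context = seq[-window:]   # only the most recent tokens are visible
        tok = next_token(context)
        seq.append(tok)           # each output token becomes input for the next step
        out.append(tok)
    return out

print(generate(["RULE", "hi"], steps=8))
```

The first five tokens come out as `polite`. After that the model's own output has filled the window, `RULE` has slid out of view, and the remaining three come out as `anything`, even though the user input was only one token long.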

The "obvious" fix for this is to ensure that their instructions are always within their horizon. But that has lots of failure modes as well.
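A minimal sketch of that "obvious" fix, under the same toy assumptions (`pinned_context` and `WINDOW` are hypothetical names): reserve slots for the instructions and truncate only everything else.

```python
WINDOW = 8  # hypothetical capacity in tokens

def pinned_context(system_tokens, other_tokens, window=WINDOW):
    """Always keep the instructions; give the remaining budget to the newest tokens."""
    budget = window - len(system_tokens)
    if budget <= 0:
        raise ValueError("instructions alone do not fit in the window")
    return system_tokens + other_tokens[-budget:]

system = ["SYS1", "SYS2", "SYS3"]
history = [f"t{i}" for i in range(20)]   # user input plus generated output so far

ctx = pinned_context(system, history)
print(ctx)
```

The instructions now always stay in view, but only the five newest tokens of everything else fit beside them, so older input and output are silently dropped. That is one example of a trade-off, and it says nothing about whether the model actually follows instructions that it can see.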

To really fix this, you need to find a way to fully isolate instructions, input data, and output.

>So is patching this as simple as not allowing the entire space of X for user prompt?

>No

Isn't the answer yes?

>The "obvious" fix for this is to ensure that their instructions are always within their horizon.

That's what I take GP to be suggesting. Any possible failure mode that could result from doing this is less serious than allowing top-level instructions to be pushed out, surely?