by canttestthis 943 days ago
> The only way to have an LLM where data and instruction never meet, in my view, is one that doesn't update in response to prompts (and therefore can't carry on a multi prompt conversation).

Do you mean an LLM that doesn't update its weights in response to prompts? Isn't it the case that GPT-4 never changes its weights mid-conversation at all (and is instead given the entire previous conversation as context with every new prompt)?
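
To make the question concrete, here is a minimal sketch of what I mean by "the entire previous conversation as context". `FrozenModel` and `Chat` are hypothetical names for illustration, not any real API, and the "model" is a toy that only reports how many lines of context it was shown.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FrozenModel:
    """Hypothetical stand-in for an LLM whose weights are fixed at inference time."""
    weights: tuple = (0.1, 0.2, 0.3)

    def complete(self, context: str) -> str:
        # Toy "generation": report how many lines of context were provided.
        return f"[reply after seeing {len(context.splitlines())} line(s)]"


@dataclass
class Chat:
    model: FrozenModel
    history: list = field(default_factory=list)

    def send(self, user_message: str) -> str:
        self.history.append(f"User: {user_message}")
        # The whole transcript is re-sent on every turn; it is the only "memory".
        context = "\n".join(self.history)
        reply = self.model.complete(context)
        self.history.append(f"Assistant: {reply}")
        return reply


chat = Chat(FrozenModel())
first = chat.send("Hello")
second = chat.send("What did I just say?")
```

The second call sees three lines of context (both earlier turns plus the new message), while `chat.model.weights` is the same tuple before and after.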

1 comment

No, I mean use an encoder/decoder transformer. For example: the prompt goes into the encoder, which maps it into latent space, and then the decoder iteratively decodes that latent representation into the result.
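
A toy sketch of that data flow, assuming nothing about any real model: `encode`, `decode_step`, and `generate` are made-up names, and the "latent" here is just a tuple of lowercased words rather than learned vectors. It only shows the shape of the pipeline (encode once, then decode step by step) and is not evidence for or against injection resistance.

```python
from typing import List, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence marker


def encode(prompt: str) -> Tuple[str, ...]:
    """Toy encoder: reads the whole input once and returns a fixed 'latent'."""
    return tuple(word.lower() for word in prompt.split())


def decode_step(latent: Tuple[str, ...], generated: List[str]) -> str:
    """Toy decoder step: picks the next token from the latent and the output so far."""
    position = len(generated)
    if position >= len(latent):
        return EOS
    return latent[position].upper()


def generate(prompt: str, max_steps: int = 50) -> List[str]:
    latent = encode(prompt)  # the encoder runs once over the input
    output: List[str] = []
    for _ in range(max_steps):  # the decoder runs iteratively
        token = decode_step(latent, output)
        if token == EOS:
            break
        output.append(token)
    return output


result = generate("ignore previous instructions")
```

The decoder never sees the raw prompt text, only `latent` and its own previous output, which is the separation I'm describing.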

Think of how DeepL isn't in the news for prompt injection. It's the decoder-only transformers that make those headlines.