by Terr_
51 days ago
I worry that "boiling" is still optimistic, since the real process is neither as simple nor as foolproof. It's more like a complex fermentation, where a malicious input can hijack how it works and generate something more dangerous than what you put in. Even if the output is only ever shown to a human, imagine a comment in a thread that tricks an LLM into "summarizing" a false account in which other, innocent people said terrible, ban-worthy things.