by DaiPlusPlus 391 days ago
> I genuinely am not sure if the answer lies in sanitization of input or output in this case

(Preface: I am not an LLM expert by any measure)

Based on everything I know (so far), it's better to say "there is no answer": this is an intractable problem with no general solution. However, many constrained use cases will be satisfied by some partial solution (i.e. a hack-fix), in the same way that the undecidability of the Halting Problem doesn't stop static analysis from being incredibly useful.

As for possible practical solutions for now: implement a strict one-way flow of information from less-secure to more-secure areas by prohibiting any LLM/agent/etc. with read access to nonpublic info from ever writing to a public space. That sounds sensible to me even without knowing anything about this specific incident.

...heck, why limit it to LLMs? The same rule should apply to CI/CD and any other system that can read from and write to both public and nonpublic areas.
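
To make the one-way rule concrete, here's a rough Python sketch. It is roughly the "no write down" rule from the Bell-LaPadula model. Everything in it (`Level`, `Principal`, `FlowViolation`, the two-level split) is a made-up name for illustration, not any real framework's API. The check deliberately keys off what the principal *can* read, not what it actually read, since that is the stricter rule described above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical two-level split; real systems may need more levels.
    PUBLIC = 0
    NONPUBLIC = 1


class FlowViolation(Exception):
    """Raised when a write would move data from a higher level to a lower one."""


@dataclass
class Principal:
    """Hypothetical stand-in for an LLM agent, a CI job, or any other actor."""

    name: str
    # Highest level this principal has been granted read access to.
    max_read: Level = Level.PUBLIC

    def grant_read(self, level: Level) -> None:
        # Read access only ratchets upward; it is never lowered again.
        self.max_read = max(self.max_read, level)

    def can_write(self, target: Level) -> bool:
        # Writes are allowed only at or above the highest readable level.
        return target >= self.max_read

    def write(self, target: Level, payload: str) -> str:
        if not self.can_write(target):
            raise FlowViolation(
                f"{self.name} can read {self.max_read.name} data, "
                f"so it may not write to {target.name}"
            )
        # A real system would perform the write here; the sketch just labels it.
        return f"[{target.name}] {payload}"
```

Usage would look something like: create `Principal("triage-bot")`, call `grant_read(Level.NONPUBLIC)`, and from then on any `write(Level.PUBLIC, ...)` raises `FlowViolation`, while writes to `Level.NONPUBLIC` still go through. The same object could just as well model a CI runner with access to private secrets.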