Hacker News
by creer 950 days ago
Debugging to look for what, though? It's interesting to think about what the "bug" could even look like. Sure, it might be easy to measure the arithmetic ability of the LLM. But if the policy the owner wants to enforce is "don't produce porn", that becomes hard to check in general, and harder still to check against arbitrary input from the customer or user.

People mention "source data exfiltration/leaking", and that's yet another, very different problem.