Hacker News
by WhiskeyChicken 940 days ago
If the training data contains enough examples of deception being used when doing illegal stuff, wouldn't this be what we'd expect to see, given that it can't actually reason about what "explicitly allowed" really means? (Forgive my ignorance if this makes no sense; I'm not well versed in generative AI.)