Hacker News
by XenophileJKO 234 days ago
This isn't likely to be a hardcoded, classified type of response. I think this response is literally "you offended the model persona's sensibilities." But yes, after the first denial the models will double down.