Hacker News
by sameers 66 days ago
That was very illuminating! Do you think you'll try experimenting with some sort of adversarial "agent" setup, where, for each model you are comparing, the code isn't released until it passes a security review by that same model?