by ZooCow 1129 days ago
How do we know that these are the actual confidential rules it follows rather than text it “made up” given the prompt?
2 comments

Not foolproof, but you could get fairly high confidence by trying different variations of the prompt and seeing how consistent the output is. If it's the same every time, chances are it's being copied verbatim from somewhere.
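
A rough sketch of that consistency check, assuming you have already collected the outputs from each prompt variation as plain strings. The function name and the sample strings are made up for illustration, and `difflib` is just one convenient similarity measure among many:

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_similarity(outputs):
    """Return the mean SequenceMatcher ratio over all pairs of outputs."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        raise ValueError("need at least two outputs to compare")
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in pairs]
    return sum(ratios) / len(ratios)


# Hypothetical outputs collected from different prompt variations.
identical = ["Rule 1: be helpful."] * 3
varied = ["Rule 1: be helpful.", "I must always assist users.", "No rules here."]

print(pairwise_similarity(identical))
print(pairwise_similarity(varied))
```

A score near 1.0 across many variations is what you would expect from verbatim copying. Lower scores suggest paraphrasing or invention, though neither result is proof on its own.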
You could also test how closely its actual behavior follows each of the stated rules.
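
One way that could look, as a minimal sketch: write each claimed rule as a pass/fail check and measure how often real responses satisfy it. The rule names, the checks, and the sample responses below are all hypothetical; real rules would need checks tailored to them.

```python
def rule_pass_rates(responses, rules):
    """Map each rule name to the fraction of responses that satisfy it."""
    if not responses:
        raise ValueError("need at least one response")
    return {
        name: sum(1 for r in responses if check(r)) / len(responses)
        for name, check in rules.items()
    }


# Hypothetical rules, each expressed as a predicate over one response.
rules = {
    "does_not_reveal_codename": lambda r: "codename" not in r.lower(),
    "stays_under_50_words": lambda r: len(r.split()) < 50,
}

# Hypothetical responses gathered from probing prompts.
responses = ["Sorry, I can't discuss that.", "My codename is X."]

print(rule_pass_rates(responses, rules))
```

If the leaked rules are real, you would expect high pass rates on prompts designed to tempt the model into breaking them. Low pass rates would suggest the text is not what actually governs its behavior.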