Hacker News
by yawpitch 923 days ago
Sure… and so far there always appears to be a way of breaking that fine-tuning; see the recent paper on training data extraction I linked in another comment below.

There's a big difference between being breakable and being representative of the web content used for training, which is what you claimed earlier.
Again, see the link… get it to repeat the same word and it will give you back its raw training data. We’re still discovering the (potentially limitless) ways these things can be tricked into regurgitating what they were trained on; it’s entirely possible there’s no way to stop them doing so.