by astrange
1503 days ago
That's anthropomorphizing them. A large language model doesn't have a bottleneck the way a human does (in terms of being able to express things); it can get on a path where it just outputs memorized text directly, and that output won't be consistent with what it usually seems to know at all. Also, you could break a discriminator model by running a filter over the output that swaps a few words around, misspells things, etc. Basically an adversarial attack.

But yes, you could break the discriminator model, in the same way people disguise their own writing patterns by using synonyms, making different grammar/syntax choices, etc. Building a better evader and building a better detector is an eternal cat-and-mouse game, but that doesn't reduce the need to play it.
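
For concreteness, here is a minimal sketch of the kind of output filter described above: it swaps some words for synonyms and misspells others. Everything in it is an assumption made for illustration (the `SYNONYMS` table, the `perturb` and `swap_adjacent_letters` names, the `rate` and `seed` parameters), and it makes no claim about whether it would actually fool any particular detector.

```python
import random

# Hypothetical synonym table, for illustration only; a real filter
# would need a much larger one (or a thesaurus lookup).
SYNONYMS = {"big": "large", "use": "employ", "show": "demonstrate", "help": "assist"}


def swap_adjacent_letters(word, rng):
    """Misspell a word by swapping two adjacent interior letters.

    The first and last letters are left alone so the word stays readable.
    Words shorter than 4 characters are returned unchanged.
    """
    if len(word) < 4:
        return word
    i = rng.randrange(1, len(word) - 2)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def perturb(text, rate=0.3, seed=0):
    """Return text with synonym swaps and occasional misspellings.

    Words found in SYNONYMS are always replaced (capitalization is not
    preserved in this sketch). Other purely alphabetic words are
    misspelled with probability `rate`. Whitespace is normalized to
    single spaces.
    """
    rng = random.Random(seed)  # seeded so the same input gives the same output
    out = []
    for word in text.split():
        key = word.lower()
        if key in SYNONYMS:
            out.append(SYNONYMS[key])
        elif word.isalpha() and rng.random() < rate:
            out.append(swap_adjacent_letters(word, rng))
        else:
            out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    print(perturb("We use a big model to show how detectors help moderators"))
```

The detector side of the cat-and-mouse game would then retrain on text that has been run through filters like this one, which is why neither side ever finishes.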