by everlier 214 days ago
There was another technique, "klmbr", a year or so ago: https://github.com/av/klmbr At the highest setting, it was unparseable by the LLMs of the time. Now, however, it looks like all major foundation models handle it easily, so some similar input scrambling is likely part of robustness training for modern models.

Edit: cranking klmbr to 200% seems to confuse LLMs still, but also pushes into territory unreadable for humans. "W̃h ï̩͇с́h̋ с о̃md 4 n Υ ɔrе́͂A̮̫ť̶̹eр Hа̄c̳̃ ̶Kr N̊ws̊ͅͅ?"
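
For anyone curious what this kind of input scrambling looks like mechanically, here is a minimal Python sketch. It is not klmbr's actual code: the `scramble` function, the small homoglyph table, the list of combining marks, and the reading of a rate above 1.0 as "stack more than one perturbation per letter" are all assumptions made up for illustration.

```python
import random

# Hypothetical lookalike table (Latin -> Cyrillic), chosen for illustration only.
HOMOGLYPHS = {
    "a": "\u0430",
    "c": "\u0441",
    "e": "\u0435",
    "o": "\u043e",
    "p": "\u0440",
}

# A few Unicode combining marks: acute, tilde, diaeresis, ring above.
COMBINING = ["\u0301", "\u0303", "\u0308", "\u030a"]


def scramble(text: str, rate: float, seed: int = 0) -> str:
    """Perturb letters in `text`.

    `rate` is the expected number of perturbations per letter, so 0.5
    touches about half the letters and 2.0 applies two to every letter.
    This interpretation is an assumption, not klmbr's definition.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    out = []
    for ch in text:
        if not ch.isalpha():
            out.append(ch)  # leave spaces, digits, punctuation alone
            continue
        whole = int(rate)
        # Guaranteed perturbations plus one more with the fractional probability.
        n = whole + (1 if rng.random() < rate - whole else 0)
        # Optionally spend one perturbation on a homoglyph swap.
        if n > 0 and ch in HOMOGLYPHS and rng.random() < 0.5:
            ch = HOMOGLYPHS[ch]
            n -= 1
        # Spend the rest on stacked combining marks.
        out.append(ch + "".join(rng.choice(COMBINING) for _ in range(n)))
    return "".join(out)


if __name__ == "__main__":
    for rate in (0.0, 0.5, 2.0):
        print(rate, scramble("Which comment", rate))
```

At a rate of 0.0 the text comes back unchanged, and as the rate climbs each letter picks up more stacked marks, which is roughly why the output stops being readable for people as well.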

1 comment

While these methods may help for the moment, there is no reason to think models won't be trained past them far faster than the average user will figure out how to avoid the problems these methods cause.

In some ways we're reaching the 'game over' stage, where models converge on human-like input understanding and the only way to beat them is to make the input illegible to humans.