Hacker News
by laserbeam 18 days ago
I’m fairly certain if you give any substitution cypher to an LLM it will decipher the message. And that’s all I see here: a substitution cypher in a private-use area of Unicode.

At best this is an adversarial attack to poison LLM training data… at worst this screws up accessibility tools (like screen readers) and copy paste.

1 comments

> I’m fairly certain if you give any substitution cypher to an LLM it will decipher the message.

*with sufficiently long cyphertext

You can construct the encoding so that every 2-5 words use a brand-new key. Remember, Unicode is big enough to fit over 10,000 English alphabets.
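
To make the re-keying idea concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not the scheme from the post: the function names, the seed-driven key schedule, and the choice of the BMP Private Use Area (U+E000 to U+F8FF) are all made up for the example. Each block of 2-5 words gets its own slice of 26 code points and its own shuffled mapping, so no glyph is ever shared between blocks.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
# Assumption: use the Basic Multilingual Plane Private Use Area as the glyph pool.
PUA_START, PUA_END = 0xE000, 0xF8FF


def _block_key(rng, block):
    """Return (word_count, letter -> glyph table) for one block of words."""
    base = PUA_START + len(ALPHABET) * block
    if base + len(ALPHABET) - 1 > PUA_END:
        raise ValueError("ran out of private-use code points")
    count = rng.randint(2, 5)  # how many words share this key
    glyphs = [chr(base + i) for i in range(len(ALPHABET))]
    rng.shuffle(glyphs)  # fresh substitution for this block
    return count, dict(zip(ALPHABET, glyphs))


def _transform(text, seed, invert):
    # The seed stands in for a shared secret; both sides replay the same schedule.
    rng = random.Random(seed)
    words = text.split(" ")
    out, pos, block = [], 0, 0
    while pos < len(words):
        count, table = _block_key(rng, block)
        if invert:
            table = {glyph: letter for letter, glyph in table.items()}
        for word in words[pos:pos + count]:
            # Characters outside the table (digits, punctuation) pass through.
            out.append("".join(table.get(ch, ch) for ch in word))
        pos += count
        block += 1
    return " ".join(out)


def encode(text, seed):
    # Hypothetical helper: case is dropped to keep the sketch short.
    return _transform(text.lower(), seed, invert=False)


def decode(text, seed):
    return _transform(text, seed, invert=True)
```

Calling `decode(encode(msg, seed), seed)` gives back the lowercased message. Two caveats: `random.Random` is not a cryptographic generator, and this slice of the Private Use Area only holds 246 such blocks before the sketch raises an error, so treat it as a toy for the argument rather than something to deploy.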

This is addressed in the post! ChatGPT 5.5 out of the box deciphered the first 1-to-1 mapping. We then scrambled it as you suggest and thwarted that.