Hacker News
Show HN: Steganography in natural language using LLM logit-rank steering (github.com)
2 points by shevis 164 days ago
I have had this idea rattling around my head for a while now. From the readme:

> subtext-codec is a proof-of-concept codec that hides arbitrary binary data inside seemingly normal LLM-generated text. It steers a language model's next-token choices using the rank of each token in the model's logit distribution. With the same model, tokenizer, prefix, and parameters, the process is fully reversible -- enabling text that reads naturally while secretly encoding bytes.

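Basically, it relies on the fact that, for a fixed model, context, and parameters, an LLM produces a deterministic probability distribution over the next token. Sender and receiver can therefore agree on each token's rank, which lets the encoder produce seemingly innocuous text that carries a hidden payload and is hard to detect.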
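
To make the rank idea concrete, here is a minimal toy sketch in Python. It is not the subtext-codec implementation: `ranked_tokens` is a hypothetical stand-in for a real model (it ranks a 16-word vocabulary by hashing the context), and `VOCAB`, `BITS_PER_TOKEN`, `encode`, and `decode` are names I made up for illustration. The point is only to show that if both sides can reproduce the same ranking at every step, picking the token at rank `r` encodes the bits of `r`, and the receiver recovers them by looking up each token's rank.

```python
import hashlib

# Toy vocabulary; a real model would have tens of thousands of tokens.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under",
         "mat", "tree", "quietly", "today", "and", "then", "it", "slept"]
BITS_PER_TOKEN = 2  # each step picks among the top 2**2 = 4 ranked tokens


def ranked_tokens(context):
    """Hypothetical stand-in for an LLM: a deterministic ranking of VOCAB
    that depends only on the context so far (best candidate first)."""
    def score(tok):
        data = (" ".join(context) + "|" + tok).encode()
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
    return sorted(VOCAB, key=score, reverse=True)


def encode(payload, prefix):
    """Hide payload bytes by choosing, at each step, the token whose rank
    equals the next BITS_PER_TOKEN bits of the payload."""
    bits = "".join(f"{b:08b}" for b in payload)
    context = list(prefix)
    out = []
    for i in range(0, len(bits), BITS_PER_TOKEN):
        rank = int(bits[i:i + BITS_PER_TOKEN], 2)
        tok = ranked_tokens(context)[rank]
        out.append(tok)
        context.append(tok)
    return out


def decode(tokens, prefix):
    """Recover the payload by recomputing the ranking at each step and
    reading off the rank of the token that was actually chosen."""
    context = list(prefix)
    bits = ""
    for tok in tokens:
        rank = ranked_tokens(context).index(tok)
        bits += f"{rank:0{BITS_PER_TOKEN}b}"
        context.append(tok)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


prefix = ["once"]
hidden = encode(b"hi", prefix)
print(" ".join(hidden))
print(decode(hidden, prefix))
```

The round trip only works because `ranked_tokens` returns exactly the same order on both sides. With a real model, the same requirement applies to the logits, so any difference in model, tokenizer, prefix, or parameters would presumably break decoding.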

1 comment

requires fully deterministic inference, which turns out to be unusual, but for this sort of thing it's probably fine if you do really slow inference on cpu. cool idea.