| HN Mirror

by textninja 786 days ago

I was imagining the message encoded in clear text, not encrypted form, because given the lengths required to coordinate protocol, keys, weights, and so on, I assumed there would be more efficient ways to disguise a message than a novel form of steganography. As such, I approached it as a toy problem, and considered detection by savvy parties to be a feature, not a bug; I imagined something more like a pirate broadcast than a secure line, and intentionally ignored the presumption about the message being encrypted first.

That being said, yes, some of my assumptions were incorrect, mainly regarding temperature. For practical reasons I was envisioning this being implemented with a third-party LLM (e.g. OpenAI's), but I didn't realize those could have their RNG seeded as well. There is the security/convenience tradeoff to consider, however, and simply setting the temperature to 0 is a lot easier to coordinate between sender and receiver than agreeing on two arbitrary numbers for temperature and seed.
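To make the temperature-versus-seed tradeoff concrete, here is a minimal sketch. It assumes a toy, hand-written probability table rather than a real LLM, and `pick_token` is a hypothetical helper, not any provider's API. With temperature 0 the choice is a plain argmax, so nothing beyond the model needs to be shared; with a nonzero temperature both sides must agree on the temperature and the seed to replay the same choices.

```python
import random

def pick_token(probs, temperature, rng=None):
    """Pick one token from a dict of token -> probability (toy stand-in for an LLM step)."""
    if temperature == 0:
        # Greedy decoding: always the single most likely token, no randomness involved.
        return max(probs, key=probs.get)
    # Temperature scaling: p ** (1/T), renormalised implicitly by random.choices.
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return rng.choices(list(probs), weights=weights, k=1)[0]

# Hypothetical next-token distribution, for illustration only.
probs = {"cat": 0.5, "dog": 0.3, "eel": 0.2}

# Temperature 0: reproducible with no extra coordination.
greedy = [pick_token(probs, 0) for _ in range(5)]

# Temperature > 0: reproducible only if both sides share temperature AND seed.
sender = random.Random(1234)
receiver = random.Random(1234)
sent = [pick_token(probs, 0.8, sender) for _ in range(5)]
replayed = [pick_token(probs, 0.8, receiver) for _ in range(5)]
```

Whether a hosted API replays its sampling exactly for a given seed is a separate question that this sketch does not address.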

I misspoke, or at least left myself open to misinterpretation, when I referred to the LLM's weights as a "secret key"; I didn't mean the weights themselves had to be kept under wraps. Rather, I meant that either both parties had to possess the weights (with the knowledge of which weights to use being the "secret"), or they'd have to use a frozen version of a third-party LLM, in which case the knowledge of which version to use would become the secret.

As for how I might take a first stab at implementing this myself: I might encode the message in a low base (say binary or ternary) and let the most likely next token stand for 0, the second most likely for 1, and so on. To offset the risk of producing pure nonsense, I would perhaps skip positions where the gulf between the probabilities of the 1st and 2nd most likely tokens is too large.
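A minimal binary sketch of that idea, under loud assumptions: `toy_ranked_probs` is a hypothetical, hash-based stand-in for an LLM (deterministic, so sender and receiver get identical rankings), the vocabulary is made up, and "skipping" is interpreted as emitting the top token without encoding a bit. It shows the bookkeeping only, not fluent text.

```python
import hashlib

# Made-up vocabulary for illustration.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "tree"]

def toy_ranked_probs(context):
    """Hypothetical LLM stand-in: deterministic (token, prob) pairs, most likely first."""
    prefix = "|".join(context)
    weights = []
    for tok in VOCAB:
        digest = hashlib.sha256(f"{prefix}->{tok}".encode()).digest()
        weights.append((digest[0] + 1, tok))  # pseudo-random weight in 1..256
    total = sum(w for w, _ in weights)
    ranked = sorted(weights, key=lambda wt: (-wt[0], wt[1]))  # ties broken by token
    return [(tok, w / total) for w, tok in ranked]

def carries_bit(ranked, max_gap):
    """A position encodes a bit only if the top two candidates are close in probability."""
    return ranked[0][1] - ranked[1][1] <= max_gap

def encode(bits, max_gap=0.05, max_tokens=10_000):
    tokens, i = [], 0
    while i < len(bits):
        if len(tokens) >= max_tokens:
            raise RuntimeError("gave up: too many skipped positions")
        ranked = toy_ranked_probs(tokens)
        if carries_bit(ranked, max_gap):
            tokens.append(ranked[bits[i]][0])  # rank 0 encodes 0, rank 1 encodes 1
            i += 1
        else:
            tokens.append(ranked[0][0])  # gap too large: emit top token, encode nothing
    return tokens

def decode(tokens, max_gap=0.05):
    bits = []
    for pos, tok in enumerate(tokens):
        ranked = toy_ranked_probs(tokens[:pos])  # replay the sender's view
        if carries_bit(ranked, max_gap):
            bits.append([t for t, _ in ranked].index(tok))
    return bits
```

Usage would look like `decode(encode([1, 0, 1, 1]))`, which recovers the original bits as long as both sides share the same model stand-in and the same `max_gap`. A ternary variant would index ranks 0 to 2 instead.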

1 comment

eru 786 days ago

> I was imagining the message encoded in clear text, not encrypted form, [...]

I was considering that, but I came to the conclusion that it would be an exceedingly poor choice.

Steganography is there to hide that a message has been sent at all. If you make it do double duty as a poor-man's encryption, you are going to have a bad time.

> As such, I approached it as a toy problem, and considered detection by savvy parties to be a feature, not a bug; I imagined something more like a pirate broadcast than a secure line, and intentionally ignored the presumption about the message being encrypted first.

That's an interesting toy problem. In that case, I would still suggest compressing the message, to reduce redundancy.
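As a rough sketch of that step, assuming zlib as the compressor (any lossless codec would do) and hypothetical helper names, the message could be compressed before being turned into the bit stream that the steganographic encoder consumes:

```python
import zlib

def message_to_bits(message: str) -> list[int]:
    """Compress the UTF-8 message, then unpack each byte into 8 bits, MSB first."""
    packed = zlib.compress(message.encode("utf-8"), 9)
    return [(byte >> shift) & 1 for byte in packed for shift in range(7, -1, -1)]

def bits_to_message(bits: list[int]) -> str:
    """Inverse of message_to_bits: repack bits into bytes, then decompress."""
    packed = bytes(
        sum(bit << (7 - k) for k, bit in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits), 8)
    )
    return zlib.decompress(packed).decode("utf-8")
```

One caveat: zlib adds a small header and checksum, so very short messages can come out longer than the raw text. The saving only shows up once the message has enough redundancy to remove.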

textninja 786 days ago

> If you make it do double duty as a poor-man's encryption, you are going to have a bad time.

For the serious use cases you evidently have in mind, yes, it's folly to have it do double duty, but at the end of the day steganography is an obfuscation technique orthogonal to encryption, so the question of whether to use encryption or not is a nuanced one. Anyhow, I don't think it's fair to characterize this elaborate steganography tech as a poor-man's encryption — LLM tokens are expensive!

eru 786 days ago

> Anyhow, I don't think it's fair to characterize this elaborate steganography tech as a poor-man's encryption — LLM tokens are expensive!

I guess it's a "rich fool's encryption".

textninja 786 days ago

Haha, sure, you can call it that if you want, but foolish is cousin to fun, so one application of this tech would be as a comically overwrought way of communicating subtext to an adversary who may not be able to read between the lines otherwise. Imagine using all this highly sophisticated and expensive technology just to write "you're an asshole" to some armchair intelligence analyst who spent their afternoon and monthly token quota decoding your secret message.

Seed for the message above is 42 by the way.

(Just kidding!)
