Sorry, I wasn't very complete in my description. I mean that 0,0,0,0... would correspond to the "most probable" continuation of some prompt, and it would map to sensible English. And then 48298346,1,3,2... would correspond to a less probable continuation of the prompt, but it would also map to sensible English. But which continuations are more or less probable, and the associated probabilities, would be knowable only by someone with access to the secret LLM.

So you'd feed the algorithm some starter text like: "Here's my favorite recipe for brownies", and then you'd give it some data to encode, and depending on which data you gave it, you'd get a different, but "plausible", recipe for brownies. The recipient could reverse the recipe back into numbers, and from that they'd decode the hidden message.
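To make the round trip concrete, here's a rough sketch of the rank-based idea in Python. Everything in it is made up for illustration: `ranked_candidates` stands in for the secret LLM, and instead of real probabilities it just orders a tiny hypothetical vocabulary by a keyed hash of the context. That means the output won't read as English the way a real model's would; it only shows that the same secret plus the same prompt lets the recipient recover the numbers.

```python
import hashlib

# Hypothetical toy vocabulary; a real model would have tens of thousands of tokens.
VOCAB = ["melt", "butter", "sugar", "cocoa", "flour", "eggs", "stir", "bake"]


def ranked_candidates(secret: str, context: list[str]) -> list[str]:
    """Stand-in for the secret LLM (assumption, not a real model).

    Returns VOCAB in a deterministic order that depends on the secret and
    the context so far. Index 0 plays the role of "most probable".
    """
    prefix = secret + "|" + " ".join(context) + "|"
    return sorted(
        VOCAB,
        key=lambda word: hashlib.sha256((prefix + word).encode()).hexdigest(),
    )


def encode(secret: str, prompt: str, ranks: list[int]) -> str:
    """Turn a list of ranks into text by picking the rank-th candidate each step."""
    context = prompt.split()
    chosen = []
    for rank in ranks:
        word = ranked_candidates(secret, context)[rank]
        chosen.append(word)
        context.append(word)  # the next ranking depends on what was emitted
    return " ".join(chosen)


def decode(secret: str, prompt: str, text: str) -> list[int]:
    """Recover the ranks by replaying the same rankings and looking up each word."""
    context = prompt.split()
    ranks = []
    for word in text.split():
        ranks.append(ranked_candidates(secret, context).index(word))
        context.append(word)
    return ranks


prompt = "Here's my favorite recipe for brownies"
hidden = [0, 3, 7, 1, 0, 5]
cover_text = encode("shared-secret", prompt, hidden)
recovered = decode("shared-secret", prompt, cover_text)
```

With a real model you'd swap `ranked_candidates` for a call that sorts the model's next-token distribution, and both sides would need exactly the same weights and tokenizer for the rankings to line up.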

The trick would be balancing the LLM's attempt to make sense against whatever additional constraints came along with your data encoding scheme. If you tried to encode too much ciphertext into a too-short brownie recipe, the result would stop reading convincingly as a recipe. Conveniently, it's conventional to prefix recipes with a tremendous amount of text that nobody reads, so you've got a lot of entropy to play with.
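For a rough feel of that trade-off, here's some back-of-the-envelope arithmetic, under the simplifying assumption that every token is chosen from a fixed top-k list of candidates, so each token carries a whole number of bits (floor of log2 k). The function name and the fixed top-k scheme are my own illustration; a scheme driven by the model's actual probabilities would behave differently.

```python
def tokens_needed(payload_bytes: int, top_k: int) -> int:
    """Tokens of cover text required if each token picks one of top_k candidates.

    Assumes a fixed top_k at every step (a simplification) and counts only
    whole bits per token: floor(log2(top_k)).
    """
    if top_k < 2:
        raise ValueError("need at least 2 candidates to carry any data")
    bits_per_token = top_k.bit_length() - 1  # floor(log2(top_k))
    total_bits = payload_bytes * 8
    return -(-total_bits // bits_per_token)  # ceiling division


# A 32-byte ciphertext under different candidate-list sizes:
print(tokens_needed(32, 2))   # 256 tokens, always near the model's top choices
print(tokens_needed(32, 4))   # 128 tokens
print(tokens_needed(32, 16))  # 64 tokens, but picks stray further from "most probable"
```

Smaller k keeps the text closer to what the model would have written anyway, at the cost of a longer recipe preamble.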