Show HN: Fincher, a steganography tool for text

	Show HN: Fincher, a steganography tool for text (github.com)
	40 points by m4xm4n 2733 days ago

5 comments

fredley 2733 days ago

Very interesting tool, although storing as typos does seem to be a bit visible and prone to mistaken 'correction'. Other approaches to consider might be:

* Changing punctuation for visually identical, but different characters. This would not work for printed documents, however.

* Encoding only 'believable' typos, e.g. it's/its. You could encode a binary stream across all instances of it(')s, or other substitutions.

* Encoding the stream in whitespace, e.g. two/one spaces after a full stop. Printed documents would be lossy (as full stops at line endings would be ambiguous), though there are error detection/correction systems that can help.
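
To make the whitespace idea concrete, here is a minimal Python sketch, not anything Fincher itself does. It assumes one space after a full stop means 0 and two spaces mean 1, and it skips full stops at line endings because those are the ambiguous ones. The names `SLOT`, `embed` and `extract` are made up for illustration.

```python
import re

# A "slot" is a full stop followed by one or two spaces and then a
# non-space character. Full stops at the end of a line never match,
# so they carry no bit.
SLOT = re.compile(r"\. {1,2}(?=\S)")


def embed(cover: str, bits: str) -> str:
    """Write each bit as one space (0) or two spaces (1) after a full stop."""
    if len(bits) > len(SLOT.findall(cover)):
        raise ValueError("cover text has too few full stops for this payload")
    remaining = iter(bits)

    def repl(match: re.Match) -> str:
        bit = next(remaining, None)
        if bit is None:
            return match.group(0)  # payload exhausted: leave the slot alone
        return ".  " if bit == "1" else ". "

    return SLOT.sub(repl, cover)


def extract(stego: str, n: int) -> str:
    """Read back the first n bits; the receiver must know n in advance."""
    slots = SLOT.findall(stego)
    return "".join("1" if s == ".  " else "0" for s in slots[:n])


cover = "One. Two. Three. Four."
stego = embed(cover, "101")
recovered = extract(stego, 3)
```

The payload length has to be shared out of band here; a real scheme would probably prefix a length or add the error correction mentioned above.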

bambax 2733 days ago

Typical OCR errors would be interesting too: confusing the letter "n" with the letters "ri", for example.

It would be visually challenging to detect (and also, maybe, difficult for an OCR engine).

nrjames 2733 days ago

Snow is interesting and uses white space instead. http://www.darkside.com.au/snow/

jwilk 2733 days ago

Discussed on HN:

https://news.ycombinator.com/item?id=17524693

m4xm4n 2733 days ago

Yeah, I need to work on making the displacements and replacements a bit more context-aware (& probably linguistically aware). There are cases where it can "replace" a character with the same character, for example.

I do like your idea about visually similar but distinct character replacement. That would be a really fun one to implement.
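
For anyone curious what the look-alike replacement might involve, a rough Python sketch follows. It is an assumption-laden illustration, not Fincher's code: the `HOMOGLYPHS` table is hypothetical, how close the pairs look depends on the font, and the cover text is assumed to contain none of the look-alike characters already.

```python
# Hypothetical table: ASCII punctuation -> a similar-looking code point.
HOMOGLYPHS = {
    ",": "\u201a",  # SINGLE LOW-9 QUOTATION MARK
    ";": "\u037e",  # GREEK QUESTION MARK
    "-": "\u2010",  # HYPHEN
}
REVERSE = {v: k for k, v in HOMOGLYPHS.items()}


def embed(cover: str, bits: str) -> str:
    """Keep the ASCII character for a 0 bit, swap in the look-alike for a 1 bit."""
    remaining = iter(bits)
    out = []
    for ch in cover:
        if ch in HOMOGLYPHS:
            if next(remaining, "0") == "1":
                ch = HOMOGLYPHS[ch]
        out.append(ch)
    if next(remaining, None) is not None:
        raise ValueError("cover text has too little punctuation for this payload")
    return "".join(out)


def extract(stego: str, n: int) -> str:
    """Read the first n bits back from the punctuation."""
    bits = []
    for ch in stego:
        if ch in HOMOGLYPHS:
            bits.append("0")
        elif ch in REVERSE:
            bits.append("1")
    return "".join(bits[:n])


stego = embed("a, b; c-d", "101")
recovered = extract(stego, 3)
```

One caveat: Unicode normalization folds U+037E back into an ordinary semicolon, so any pipeline that normalizes text would silently erase that bit.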

wstuartcl 2730 days ago

I worked on something very similar. My version also mutated punctuation, swapped common phrases/words for synonyms, and re-ordered sentences. Instead of steganography, the purpose was to create identifiable mutations in text, acting as a canary to tie disclosures back to specific recipients. Each party receiving a confidential document got slight mutations unique to their own copy, and a copy/paste of even a fairly small fragment (or fragments) could be used to identify the owner of that version.
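
A toy Python sketch of the canary idea, for illustration only; it is not the tool described above. It assumes a hypothetical list of synonym pairs (`PAIRS`), a master text that uses the first word of each pair, and a numeric recipient id whose binary digits choose the synonyms.

```python
import re

# Hypothetical synonym pairs; each pair that occurs in the text carries one bit.
PAIRS = [
    ("begin", "start"),
    ("large", "big"),
    ("quickly", "rapidly"),
    ("shows", "demonstrates"),
]


def fingerprint(master: str, recipient_id: int) -> str:
    """Return a copy of master whose synonym choices spell recipient_id in binary."""
    text = master
    for i, (zero, one) in enumerate(PAIRS):
        if (recipient_id >> i) & 1:
            text = re.sub(rf"\b{re.escape(zero)}\b", one, text)
    return text


def identify(fragment: str, recipients) -> set:
    """Return the recipient ids consistent with the synonyms visible in fragment."""
    candidates = set(recipients)
    for i, (zero, one) in enumerate(PAIRS):
        if re.search(rf"\b{re.escape(one)}\b", fragment):
            observed = 1
        elif re.search(rf"\b{re.escape(zero)}\b", fragment):
            observed = 0
        else:
            continue  # this pair is not in the fragment, so it tells us nothing
        candidates = {r for r in candidates if ((r >> i) & 1) == observed}
    return candidates


master = "We begin with a large sample and the model quickly shows gains."
copy_5 = fingerprint(master, 5)
suspects = identify("the model rapidly shows", [3, 5, 10])
```

A short fragment only narrows the candidate set to the ids that agree on the pairs it happens to contain, which is why a real scheme would want many more mutation points than recipients.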

matt_the_bass 2730 days ago

This seems like a useful tool. Is it a product?

wstuartcl 2722 days ago

No, sorry, it was constructed to catch an employee leaking confidential company information to the media. I do not know how you could make this into a product and still maintain its reliability -- the more widely known the mutations are, the easier it would be to mitigate the watermarking.

sehugg 2733 days ago

I did one of these many years ago, basically just abusing lex/flex: https://github.com/countrygeek/stegparty/blob/master/stegpar...

josephcar 2733 days ago

This is similar to steganos (https://github.com/fastforwardlabs/steganos), which tries to limit itself to changes that do not change the meaning of the text.

m4xm4n 2733 days ago

Oh, very cool! I like the data model for the changes. I've been thinking about adding an analysis pass using something similar to make it possible to implement more sophisticated strategies. The tricky bit will be retaining the stream-based approach.

awinter-py 2733 days ago

First Crystal codebase I've seen! Niccce.
