by lifthrasiir
433 days ago
No, I was talking about the generated fonts themselves; each glyph would have an associated set of control points, which can be used to map the glyph to the correct letter. There is no need to run full OCR on the page; a single small OCR job per glyph is enough. You would need quite elaborate distortions to avoid this kind of attack, and such distortions may affect the reading experience.
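A rough sketch of that per-glyph attack, under heavy assumptions: outlines are modelled as plain lists of (x, y) control points rather than parsed from a real TTF/OTF file, and the names `normalize`, `identify`, `REFERENCE` and `OBFUSCATED_FONT` are made up for illustration. It matches control points directly instead of running actual OCR, and only handles glyphs that were scaled or moved, not distorted.

```python
def normalize(points):
    """Move the outline to the origin and scale it to a unit box."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in points]


def distance(a, b):
    """Sum of squared point distances; outlines of different size never match."""
    if len(a) != len(b):
        return float("inf")
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))


def identify(glyph_points, reference):
    """Return the reference letter whose outline is closest to the glyph."""
    target = normalize(glyph_points)
    return min(reference,
               key=lambda letter: distance(target, normalize(reference[letter])))


# Hypothetical reference outlines taken from some ordinary, known font.
REFERENCE = {
    "I": [(0, 0), (100, 0), (100, 700), (0, 700)],
    "L": [(0, 0), (400, 0), (400, 100), (100, 100), (100, 700), (0, 700)],
}

# Hypothetical obfuscated font: meaningless code points, same shapes
# (here merely scaled and moved).
OBFUSCATED_FONT = {
    "\ue001": [(50, 20), (250, 20), (250, 70), (100, 70), (100, 370), (50, 370)],
    "\ue002": [(10, 10), (30, 10), (30, 150), (10, 150)],
}

recovered = {code: identify(points, REFERENCE)
             for code, points in OBFUSCATED_FONT.items()}
```

Because the comparison relies on identical point counts and point order, reordering or perturbing the control points would defeat this naive version; that is the kind of distortion that forces the attacker back to rendering each glyph and running a small OCR job on it.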
The HTML is garbage without the correctly rendered webfont that matches the shifts and replacements applied to the source code. The source code never contains the original text, only the already shifted text.
Inside the TTF/OTF files themselves, each letter is shifted as well, so the letters only make sense once you know the seed for the multiple shifts; without it, you cannot map the glyphs in the font 1:1 to anything in the HTML.
The web browser here is pretty easy to trick, because it will just use the glyphs available in the font and fall back to the default font for any that aren't. By design, that also allows partial replacements and shifts for further obfuscation if needed. You can additionally replace whole glyph sequences with embedded ligatures.
The seed can therefore be used as an instruction mapping instead of only functioning as a byte sequence for a single static rotation (hence the reference to Enigma).
How would control points in the webfont files be able to map it back?
Suppose you use multiple rotations as in Enigma, and that sequence is essentially the seed (e.g. 3, 74, 8, 627, or whatever shifts applied one after another). The only attack I know of would be statistical analysis of the alphabet, but that won't work once the character set includes special characters outside the ASCII range, because you won't know where words start or end.
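The scheme described in this comment could look roughly like the sketch below. Everything in it is an assumption made for illustration: the names `build_mapping`, `obfuscate` and `font_cmap`, the use of Private Use Area code points starting at U+E000, and above all the seeded shuffle, which stands in for the chain of shifts and is not the actual construction. The point it shows is that the shipped HTML holds only substituted code points, while the font carries the table that turns them back into readable glyphs.

```python
import random


def build_mapping(alphabet, seed):
    """Map each real character to a Private Use Area code point.

    The seed values drive a deterministic shuffle; this replaces the
    chained shifts from the comment with a simpler stand-in.
    """
    rng = random.Random(",".join(str(s) for s in seed))
    targets = [chr(0xE000 + i) for i in range(len(alphabet))]
    rng.shuffle(targets)
    return dict(zip(alphabet, targets))


def obfuscate(text, mapping):
    """Produce the text that would actually be written into the HTML."""
    return "".join(mapping.get(ch, ch) for ch in text)


def font_cmap(mapping):
    """Inverse table the webfont would encode: code point -> glyph shown."""
    return {code: ch for ch, code in mapping.items()}


# The space is part of the alphabet, so word boundaries are hidden too.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."
SEED = (3, 74, 8, 627)  # the example shifts from the comment

mapping = build_mapping(ALPHABET, SEED)
shipped_html = obfuscate("hello, world.", mapping)

# What the browser effectively does when it renders with the webfont.
cmap = font_cmap(mapping)
rendered = "".join(cmap.get(ch, ch) for ch in shipped_html)
```

Characters missing from the mapping pass through unchanged, which loosely mirrors the browser falling back to the default font for glyphs the webfont does not provide.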