Hacker News
by Scoundreller 1289 days ago
Or paste into notepad.exe, then copy it back into whatever you were using.

Voila!

2 comments

That will almost certainly preserve the invisible characters. Most invisible characters are used for some kind of in-line formatting in Unicode, so it's not desirable to remove them.
What inline formatting in notepad.exe? It doesn't even support bold/italics/underlining.

But I guess there are tabs and line feeds/carriage returns, so there's that.

Right-to-left/left-to-right markers. Language tags. Various invisible spaces. Homoglyphs. (all trivially filterable though)
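
To make "trivially filterable" concrete, here is a minimal Python sketch. It assumes that dropping every character in Unicode category Cf (format characters, which covers directional marks, language tags, and zero-width spaces) and folding category Zs (space separators) to a plain space is acceptable for your text. The function name `strip_invisible` is made up for illustration. It does not handle homoglyphs, and it will also remove joiners that some scripts and emoji sequences rely on, which is the concern raised above.

```python
import unicodedata


def strip_invisible(text: str) -> str:
    """Drop Unicode format characters and normalize exotic spaces.

    Assumption: nothing in category Cf is wanted in the output.
    """
    out = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            # Directional marks, language tags, zero-width space, BOM, etc.
            continue
        if category == "Zs":
            # No-break space, em space, ideographic space, etc.
            out.append(" ")
        else:
            # Everything else, including tabs and newlines, is kept as-is.
            out.append(ch)
    return "".join(out)
```

For example, `strip_invisible("a\u200eb")` returns `"ab"`, and a no-break space between two words comes back as an ordinary space.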
I've already got a script running every 2500 milliseconds to strip leading and trailing whitespace, HTML, and non-ASCII characters except for the UTF-8 characters of our local language.
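
A rough Python sketch of what one pass of such a script might look like, purely as an illustration. The allowlist `ALLOWED_EXTRA` is a hypothetical stand-in, since the local language isn't named here, and the regex is a crude tag stripper rather than a real HTML parser. The 2500 ms scheduling is left out.

```python
import re

# Hypothetical allowlist: replace with the non-ASCII letters of your language.
ALLOWED_EXTRA = frozenset("åäöÅÄÖ")

# Crude tag matcher; fine for a sketch, not a substitute for an HTML parser.
TAG_RE = re.compile(r"<[^>]*>")


def sanitize(text: str, allowed_extra: frozenset = ALLOWED_EXTRA) -> str:
    """Strip HTML tags, disallowed non-ASCII characters, and outer whitespace."""
    no_tags = TAG_RE.sub("", text)
    kept = "".join(
        ch for ch in no_tags if ch.isascii() or ch in allowed_extra
    )
    # Strip last, so whitespace exposed by the removals above is trimmed too.
    return kept.strip()
```

Stripping whitespace after the other two steps matters: removing a tag or a non-ASCII character at the edge of the string can expose new leading or trailing spaces.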