by dotancohen 2780 days ago
Those are HTML entities. Most modern programming languages come with tools to decode them, e.g. in Python:

    text = urllib.parse.unquote(text)
1 comment

urllib.parse.unquote() is unrelated to HTML. It undoes URL-encoding:

https://docs.python.org/3/library/urllib.parse.html#urllib.p...

In Python ≥ 3.4, you can use html.unescape() to decode HTML entities:

https://docs.python.org/3/library/html.html#html.unescape
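
To make the difference concrete, here is a minimal sketch comparing the two functions. The sample strings and variable names are made up for illustration; only html.unescape() and urllib.parse.unquote() come from the standard library.

```python
import html
from urllib.parse import unquote

# Hypothetical sample inputs, one per encoding.
entity_text = "Tom &amp; Jerry &lt;3 &quot;cheese&quot;"  # HTML entities
percent_text = "Tom%20%26%20Jerry"                         # URL (percent) encoding

# html.unescape() turns named and numeric character references into characters.
decoded_entities = html.unescape(entity_text)

# unquote() turns %XX escapes back into characters.
decoded_percent = unquote(percent_text)

# Each function ignores the other encoding, so the wrong call is a silent no-op.
unquote_on_entities = unquote(entity_text)
unescape_on_percent = html.unescape(percent_text)

print(decoded_entities)
print(decoded_percent)
print(unquote_on_entities == entity_text)
print(unescape_on_percent == percent_text)
```

Because the wrong function just returns its input unchanged here, mixing the two up tends to go unnoticed until the undecoded text shows up in output.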

You are 100% correct. I mixed the two encodings up. Thanks.