Hacker News
by jasomill 50 days ago
Tell that to programmers writing code to extract data from PCL print streams by stripping escape sequences and processing the result as "plain text" (in multiple incompatible extended ASCII encodings specified by the stripped escape sequences), or anyone exporting data from Excel in "CSV (Comma delimited) (*.csv)" format.

UTF-8 is everywhere. Until it's not. And it's impossible to distinguish UTF-8 from any other extended ASCII encoding given a sample containing only ASCII characters, so there's still no reliable way to process data that can only be described as "plain text".
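
To make the ambiguity concrete, here is a minimal Python sketch. The sample bytes, the list of candidate encodings, and the `decodings` helper are all made up for illustration; nothing here comes from a real PCL or Excel workflow. It decodes the same bytes under several ASCII-compatible encodings and collects the results.

```python
from typing import Dict, Optional

# Candidate encodings, chosen for illustration only. All four agree on 0x00-0x7F.
ENCODINGS = ["utf-8", "latin-1", "cp1252", "cp437"]


def decodings(data: bytes) -> Dict[str, Optional[str]]:
    """Decode `data` under each candidate encoding; None marks a decode failure."""
    results: Dict[str, Optional[str]] = {}
    for encoding in ENCODINGS:
        try:
            results[encoding] = data.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results


# Hypothetical ASCII-only sample: no byte is above 0x7F.
ascii_sample = b"name,price\r\ncafe,3.50\r\n"

# Hypothetical sample with one non-ASCII byte: "e with acute" encoded as cp1252 (0xE9).
non_ascii_sample = "caf\u00e9".encode("cp1252")

ascii_results = decodings(ascii_sample)
non_ascii_results = decodings(non_ascii_sample)

# Every candidate yields the same text for the ASCII-only sample,
# so the sample carries no evidence about which encoding produced it.
print(len(set(ascii_results.values())) == 1)

# A single high byte is enough for the candidates to disagree:
# UTF-8 rejects it, and the remaining encodings do not all agree on its meaning.
print(non_ascii_results)
```

The ASCII-only sample decodes to identical text under every candidate, so a detector has nothing to go on. The first byte above 0x7F is where the interpretations split, and by then a guess made on the earlier ASCII-only portion may already be baked into the pipeline.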