by buckminster 2811 days ago
This is about the case where you don't know the encoding, so you don't know which byte sequences are multibyte characters. Whether you use latin1 or bytes, the edge cases are exactly the same, and they don't get handled either way.
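A rough Python sketch of what I mean, assuming the unknown input happens to be UTF-8 (the sample string and variable names are just for illustration). Decoding as latin1 never fails and round-trips losslessly, but each byte becomes its own code point, so a multibyte character can still be split in the middle, exactly as with raw bytes:

```python
# Illustrative input: "café" encoded as UTF-8. The "é" (U+00E9) takes two bytes, 0xC3 0xA9.
text = "caf\u00e9"
raw = text.encode("utf-8")

# latin1 maps every byte 0x00-0xFF to exactly one code point, so this cannot fail.
as_latin1 = raw.decode("latin1")

# The round trip is lossless, which is why latin1 is tempting for unknown data.
round_trip = as_latin1.encode("latin1")

# Both views count bytes, not characters: 5 units for a 4-character string.
lengths = (len(text), len(raw), len(as_latin1))

# Slicing at index 4 cuts the "é" in half in both representations.
cut_bytes = raw[:4]        # ends with the lone lead byte 0xC3
cut_latin1 = as_latin1[:4]  # ends with U+00C3, the same stray half-character

print(lengths)
print(cut_bytes, repr(cut_latin1))
```

Neither representation knows where the character boundaries are, so anything that slices, truncates, or measures length has the same problem in both.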