by Someone 3399 days ago
https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt:

"Below are the guidelines that were used in defining the UCS transformation format: [...] 6) It should be possible to find the start of a character efficiently starting from an arbitrary location in a byte stream."

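If they had used "10" as the marker for "this is the start of a two-byte sequence", it could not also have been used to mean "this is a byte in a multi-byte sequence, but not the first one". Reserving "10" for continuation bytes is what satisfies guideline 6: from any offset, you step backwards past bytes that start with "10" until you reach one that doesn't, and that byte is the start of the character.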
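
A rough sketch of that backwards scan, in Python. The helper names `is_continuation` and `char_start` are made up for illustration, and it assumes the input is well-formed UTF-8 and that the index is within range:

```python
def is_continuation(byte: int) -> bool:
    """True if the byte has the form 10xxxxxx (a non-initial byte)."""
    return (byte & 0b1100_0000) == 0b1000_0000


def char_start(data: bytes, index: int) -> int:
    """Return the index of the first byte of the character covering data[index].

    Assumes well-formed UTF-8, so the loop steps back at most three times.
    """
    while index > 0 and is_continuation(data[index]):
        index -= 1
    return index


# "a" is 1 byte, "é" is 2 bytes, "€" is 3 bytes.
data = "aé€".encode("utf-8")
start = char_start(data, 5)      # offset 5 is the last byte of "€"
print(start, data[start:].decode("utf-8"))
```

Because no lead byte ever starts with "10", the scan needs no knowledge of what came before the offset it was handed.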