by chrismorgan
295 days ago
> indexing by bytes instead of UTF-8 code units

When the encoding is UTF-8 (which it is here), the code unit is the byte. They called the fields byteStart and byteEnd, but more technically precise (no more or less accurate, but more precise) labels would be utf8CodeUnitStart and utf8CodeUnitEnd.
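A minimal sketch of the point, in Python: for UTF-8 the number of code units is just the number of bytes, so a byteStart/byteEnd pair is the same thing as a UTF-8 code unit range. The sample string and the helper name `slice_by_utf8_units` are made up for illustration and are not taken from the API being discussed.

```python
# Hypothetical sample text: mixes 1-byte, 2-byte and 4-byte UTF-8 sequences.
text = "héllo 👋 world"
encoded = text.encode("utf-8")

# Three different ways of counting the same string.
code_points = len(text)                           # Unicode code points
utf16_units = len(text.encode("utf-16-le")) // 2  # UTF-16 code units (2 bytes each)
utf8_units = len(encoded)                         # UTF-8 code units, i.e. bytes


def slice_by_utf8_units(s: str, byte_start: int, byte_end: int) -> str:
    """Return the part of s covered by the UTF-8 byte range [byte_start, byte_end).

    Hypothetical helper. Assumes both offsets fall on character boundaries;
    otherwise decode() raises UnicodeDecodeError.
    """
    return s.encode("utf-8")[byte_start:byte_end].decode("utf-8")


print(code_points, utf16_units, utf8_units)  # 13 14 17
print(slice_by_utf8_units(text, 7, 11))      # the waving-hand emoji (4 bytes)
```

The three counts differ, which is why the unit has to be named at all; saying "bytes" and saying "UTF-8 code units" pick out the same offsets.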