by account42
301 days ago
> Eg maybe insert “a” at position 0 is valid, but inserting at position 1 would be invalid because it might insert in the middle of a codepoint.

You have the same problem with code points; it's just hidden better. Inserting "a" between U+0065 and U+0308 (an "e" followed by a combining diaeresis, which together render as "ë") may result in a "valid" string but is still as nonsensical as inserting "a" between the UTF-8 bytes 0xC3 and 0xAB (the encoding of the precomposed "ë", U+00EB). This makes code points less suitable than UTF-8 bytes as mistakes are more likely to not be caught during development.
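To make the two cases concrete, here is a minimal Python sketch. The variable names are only illustrative, and Python's `str` is assumed as the code-point-indexed string type. Splitting the code point sequence yields a string that still encodes cleanly, while splitting the byte sequence yields data that a strict UTF-8 decoder rejects.

```python
import unicodedata

# "e" followed by COMBINING DIAERESIS (U+0308); renders as a single "ë".
decomposed = "e\u0308"

# Insert "a" at code point index 1, i.e. between U+0065 and U+0308.
split_cp = decomposed[:1] + "a" + decomposed[1:]

# The result is still a well-formed string and encodes without error,
# but the diaeresis has silently moved onto the "a".
roundtrip = split_cp.encode("utf-8").decode("utf-8")
nfc = unicodedata.normalize("NFC", split_cp)  # "e" followed by "ä" (U+00E4)

# The precomposed "ë" (U+00EB) is the two UTF-8 bytes 0xC3 0xAB.
precomposed = "\u00eb".encode("utf-8")

# Insert "a" at byte index 1, i.e. between 0xC3 and 0xAB.
split_bytes = precomposed[:1] + b"a" + precomposed[1:]

# A strict decoder refuses this, so the mistake surfaces immediately.
try:
    split_bytes.decode("utf-8")
    byte_split_detected = False
except UnicodeDecodeError:
    byte_split_detected = True
```

The code point split only shows up if someone inspects the rendered text, whereas the byte split fails the first time the data is decoded.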
> This makes code points less suitable than UTF-8 bytes as mistakes are more likely to not be caught during development.
Disagree. Allowing two kinds of bugs to slip through to runtime doesn’t make your system more resilient than allowing one kind. If you’re worried about errors like this, checksums are a much better idea than letting your database become corrupted.
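As a rough sketch of the checksum idea: the comment doesn't name a particular checksum, so SHA-256 from Python's `hashlib` is an assumption here, and `seal` and `is_intact` are hypothetical helper names. A digest stored alongside the text flags both kinds of bad insert, whether or not the result is still valid UTF-8.

```python
import hashlib

def seal(text: str):
    """Return the UTF-8 bytes of text plus a digest to store alongside them."""
    data = text.encode("utf-8")
    return data, hashlib.sha256(data).hexdigest()

def is_intact(data: bytes, digest: str) -> bool:
    """Check that the stored bytes still match the stored digest."""
    return hashlib.sha256(data).hexdigest() == digest

data, digest = seal("e\u0308")

# A bad insert between code points: still valid UTF-8, but the content changed.
bad_cp_insert = "ea\u0308".encode("utf-8")

# A bad insert between bytes of the combining mark: no longer valid UTF-8.
bad_byte_insert = data[:2] + b"a" + data[2:]

unchanged_ok = is_intact(data, digest)
cp_insert_ok = is_intact(bad_cp_insert, digest)
byte_insert_ok = is_intact(bad_byte_insert, digest)
```

Here only `unchanged_ok` is true. The digest can't tell you which edit was wrong, only that the stored bytes no longer match what was sealed.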