by Avernar
3212 days ago
You might be correct that 32 bits could have been enough, but Unicode has restricted code points to 21 bits. Why? Because of stupid UTF-16 and surrogate pairs. I'm curious why you think that UTF-8 requires complicated lookup tables.
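To make both points concrete, here is a rough Python sketch. The function names `utf8_encode` and `max_surrogate_code_point` are made up for illustration and are not from any library. It encodes a code point to UTF-8 with nothing but shifts and masks, and shows where the 21-bit ceiling comes from: the largest value a UTF-16 surrogate pair can reach.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point to UTF-8 using only shifts and masks (no tables)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def max_surrogate_code_point() -> int:
    """Largest code point a UTF-16 surrogate pair can represent."""
    high = 0xDBFF - 0xD800   # 10 payload bits from the high surrogate
    low = 0xDFFF - 0xDC00    # 10 payload bits from the low surrogate
    return 0x10000 + (high << 10) + low


print(utf8_encode(0x20AC) == "€".encode("utf-8"))
print(hex(max_surrogate_code_point()), max_surrogate_code_point().bit_length())
```

The encoder only covers the byte layout. It says nothing about what the code points mean, which is a separate question from the encoding itself.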
Because in the end it's still a Unicode encoding, and it still has to deal with BS like "equivalence", right?
That equivalence is not mechanically encoded in the, er, encoding (e.g. by making all characters that share some bit pattern equivalent); it needs external tables instead.
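A small Python sketch of what I mean, using the standard `unicodedata` module (the helper name `canonically_equal` is just made up for this example). The two spellings of "é" are canonically equivalent, yet their UTF-8 bytes have nothing in common that you could compare mechanically. You only find out they match by normalizing, and normalization is driven by the Unicode character database, i.e. tables.

```python
import unicodedata

composed = "\u00e9"      # é as one precomposed code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (table-driven)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw UTF-8 byte sequences differ...
print(composed.encode("utf-8"), decomposed.encode("utf-8"))
print(composed.encode("utf-8") == decomposed.encode("utf-8"))

# ...but the strings are equivalent once normalized.
print(canonically_equal(composed, decomposed))
```

This would be just as true of UTF-16 or UTF-32, so it's a property of Unicode rather than of UTF-8 specifically.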