by pjz
365 days ago
The problem with the 'characters are just numbers' approach is that they're not _just_ numbers. With the advent of Unicode, they're _sequences of numbers_, so a byte can mean something different as part of a sequence than it does standalone. That said, since they're numbers, we should use the most efficient checks for them, which are likely vectorized SIMD instructions particular to your hardware. And which I've seen no one mention.
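To make the "sequences of numbers" point concrete, here is a rough Python sketch, assuming UTF-8 as the encoding (the comment only says "unicode"). The helper names `utf8_role` and `decodes_alone` are made up for illustration. It shows the byte 0xA9 acting as a continuation byte inside "é" while being meaningless as UTF-8 on its own.

```python
# Illustration only: assumes UTF-8; helper names are hypothetical.

def utf8_role(byte: int) -> str:
    """Classify one byte by the role its high bits give it in UTF-8."""
    if byte < 0x80:
        return "ascii"            # 0xxxxxxx: a complete character by itself
    if byte < 0xC0:
        return "continuation"     # 10xxxxxx: only valid after a lead byte
    return "lead-or-invalid"      # 11xxxxxx: starts a sequence (a few values are never valid)


def decodes_alone(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8 by themselves."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


e_acute = "é".encode("utf-8")   # two bytes: 0xC3 0xA9
lone = e_acute[1:]              # just the second byte, 0xA9

print([utf8_role(b) for b in e_acute])   # roles of each byte in the sequence
print(decodes_alone(e_acute))            # the pair is valid UTF-8
print(decodes_alone(lone))               # the same 0xA9 alone is not
print(lone.decode("latin-1"))            # in Latin-1, 0xA9 is a different character
```

This is a per-byte check, not the SIMD version. A vectorized implementation would apply the same high-bit tests to 16 or more bytes per instruction, which is hardware-specific and not shown here.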