Hacker News new | ask | show | jobs
by anonymoushn 540 days ago
You can test whether a register full of bytes belong to a class of bytes with the high bit unset and distinct lower nibbles in 2 instructions (each with 1 cycle latency). For example the characters that commonly occur in text and must be escaped in json are "\"\\\r\n\t" (5 different bytes). Their lower nibbles are:

\": 0010

\t: 1001

\n: 1010

\\: 1100

\r: 1101

Since there are no duplicate lower nibbles, we can test for membership in this set in 2 instructions. If we want to get the value into a general-purpose register and branch on whether it's 0 or not, that's 2 more instructions.

For larger sets or sets determined at runtime, see here: http://0x80.pl/notesen/2018-10-18-simd-byte-lookup.html