| HN Mirror

	by wholesomepotato 973 days ago
	I might be wrong, but this shortcut corrupts the lower bits with garbage from the higher byte. The lookup table can detect some of those errors, but not all of them, so yes, it relies on valid input.

RaisingSpear 973 days ago

I can't see how it's any different. If you have 0x0y0z, a shift+or gives 0x0yxz, so the result is the same as what you have, just with fewer operations.
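
To make the nibble pattern concrete, here is a minimal Python sketch. It reads 0x0y0z as three bytes whose high nibbles are zero. The shift amount of 12 bits is an assumption inferred from the 0x0y0z to 0x0yxz pattern quoted above, not something stated in the thread, and `pack_shift_or` is a hypothetical name.

```python
def pack_shift_or(v: int) -> int:
    """OR the word with a copy of itself shifted right by 12 bits (three nibbles).

    The shift amount is an assumption inferred from the nibble pattern in the
    thread; the original code is not shown there.
    """
    return v | (v >> 12)


# Build the 0x0y0z pattern from three 4-bit values, one per byte.
x, y, z = 0xA, 0xB, 0xC
v = (x << 16) | (y << 8) | z      # 0x0A0B0C

packed = pack_shift_or(v)
print(hex(packed))                # 0xa0bac, i.e. nibbles 0 x 0 y x z
```

The low 12 bits now hold y, x, z in that order, so a table indexed this way would need to be laid out for the permuted order.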

wholesomepotato 972 days ago

It's unclear what is in the highest byte, so I assumed the word is not 0x000x0y0z but 0xab0x0y0z, where ab is unknown (in my earlier comment I used XX for this). If the highest byte is known, then sure, even better.

RaisingSpear 972 days ago

The '& 0xfff' eliminates the highest byte, so it doesn't matter what it is.

Your code doesn't handle the 'length' parameter, so the problem isn't the highest byte, it's bytes beyond 'length'.
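
A small Python sketch of both points, under the same assumption that the shift is 12 bits (inferred from the nibble pattern earlier in the thread, not stated in it). `table_index` is a hypothetical name, and which bytes count as "beyond length" depends on byte order, which the thread does not specify.

```python
def table_index(v: int) -> int:
    """Shift+or followed by the '& 0xfff' mask.

    The 12-bit shift is an assumption; only the mask is quoted in the thread.
    """
    return (v | (v >> 12)) & 0xFFF


clean = 0x000A0B0C        # highest byte is zero
dirty_top = 0xFF0A0B0C    # highest byte is unknown garbage

# Bits 24..31 land in bits 12..19 after the shift, so the mask drops them.
print(hex(table_index(clean)))      # 0xbac
print(hex(table_index(dirty_top)))  # 0xbac

# A stray byte inside the low three bytes (for example, one past a shorter
# 'length') does reach the index, which is the case the mask cannot fix.
stray = 0x000A0BFF
print(table_index(stray) == table_index(clean))  # False
```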

wholesomepotato 971 days ago

I think you're right. Even better. Did you have time to bench it, etc.?

I see what you mean by length. I just skimmed over the text originally as I don't have time for rather lame problems like this. I'd just add 3 bits of length to be part of the index, job done. 12KB lookup table instead of 4KB, assuming 0 is not a valid value (negate to avoid needing 0b11).
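
One possible reading of this idea, as a Python sketch. It assumes `length` only takes the values 1, 2, and 3 and that table entries are one byte each, which is what the 4KB and 12KB figures seem to imply; neither is stated outright. The 12-bit shift and the name `table_index_with_length` are likewise assumptions for illustration.

```python
SLICE_ENTRIES = 1 << 12           # 4096 entries for one 12-bit index
VALID_LENGTHS = (1, 2, 3)         # assumption: 0 is not a valid length
TABLE_ENTRIES = len(VALID_LENGTHS) * SLICE_ENTRIES


def table_index_with_length(v: int, length: int) -> int:
    """Combine the 12-bit shift+or index with the length.

    Mapping length 1..3 to 0..2 keeps the table at three slices instead of
    four; this costs an extra shift and OR on top of the base index.
    """
    if length not in VALID_LENGTHS:
        raise ValueError("length must be 1, 2, or 3")
    base = (v | (v >> 12)) & 0xFFF
    return ((length - 1) << 12) | base


print(TABLE_ENTRIES)                                 # 12288 one-byte entries
print(hex(table_index_with_length(0x0A0B0C, 3)))     # 0x2bac
```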

RaisingSpear 970 days ago

Adding the length means another shift+or operation at minimum. I already think this is slower than the technique presented in the article, and this would make it worse.

It's an interesting idea, but I don't see it being practical, even if the size of the table wasn't an issue.

wholesomepotato 970 days ago

Instruction-level parallelism will make the extra shift free. Beyond that, it needs to be benchmarked, and the result might depend on the CPU/architecture. I don't care enough to bench and optimize further.