Hacker News new | ask | show | jobs
by pjc50 61 days ago
The fun bit is right at the start when the author notices that the compiler spots this and optimizes it away.

We didn't get into the deeper question of benchmarking it vs. a three-register swap, because I suspect the latter would be handled entirely by register renaming and end up being faster due to not requiring allocation of an ALU unit. Difficult to benchmark that because in order for it to make a difference, you'd need to surround it with other arithmetic instructions.

A meta question is why this persists. It has the right qualities for a "party trick": slightly esoteric piece of knowledge, not actually that hard to understand when you do know about it, but unintuitive enough that most people don't spontaneously reinvent it.

See also: https://en.wikipedia.org/wiki/Fast_inverse_square_root , which requires a bit more maths.

The other classic use of XOR - cursor overdrawing - has also long since gone away. It used to be possible to easily draw a cursor on a monochrome display by XORing it in, then to move it you simply XOR it again, restoring the original image.

10 comments

Reminds me of the time developing the 747-8 avionics and the systems engineers started observing a bug where the entire box would just stop randomly, all output stopped, including HW traces...the thing would just halt.

About a month later an engineer decided to turn on all warnings for gcc...and behold a message stating something to the effect of "WARNING: execution will halt upon reaching this statement." The compiled code basically just halted and never returned from the function call (can't remember the specifics).

And that is how we learned not to hide compiler messages.

`-Wall` is your friend and boy it is a major irritant using third party sloppy libraries when statically compiling it all.
There are even more options you might also want to consider: https://stackoverflow.com/questions/73310403/whats-the-diffe...
I'm actually a little horrified to learn that warnings were suppressed while developing code to control a 747.
Right out of college, one of my first job offers was to work on the compiler for the computer for the Space Shuttle. Apparently because I had once taken a compiler course. Even young, naive, optimistic me thought to question the wisdom of that offer.

I ended up not taking it because the pay wasn’t great (and at the time it wasn’t really what I wanted to do), but part of me is still curious about what that would have been like.

And it took them a month to figure this out.
This is one of those things that is hard to do, but great to have done. It's hard to do with a codebase in flight with lots of people working on it.

What I remember implementing this on projects was the messiness of:

- incrementally getting the Makefiles to turn on -Wall file-by-file as they were scrubbed. I think it was something similar to "<list-of-files>: CFLAGS+=-Wall" and then add to the list.

- suppressing warnings that were "ok" on a case-by-case basis. different languages had different ways of saying "ignore error 123 here" if at all.

- I remember lint had things like this too, like /NOTREACHED/

maybe things have gotten better/cleaner.

Better to turn every warning into an error.
For releases! But have a flag to disable it: don't make people edit -Werror out while hacking, that's really annoying.
The cursor overdrawing trick in part started going away before it's time thanks to patent enforcement (one of the lawsuits infamously exacerbated Commodores financial woes towards the end)... By the time the patent expired there was really no longer much value in going back to it.
> A meta question is why this persists. It has the right qualities for a "party trick"

This is discussed in a section near the end. But I feel like the discussion of why people care about using XOR to swap two values is missing an underlying discussion of why people would care about swapping values. As shown earlier in the division example, at the point where you would do the swap, typically you can just write the rest of the code with the roles of the values swapped.

Usually it's used like:

   if max < min:
       min, max = max, min

   [... algorithm that requires min < max]
So your suggestion would be to have two versions of the algorithm in the two branches of the 'if'. This is significantly more complicated and may even be slower depending on lots of factors.
> typically you can just write the rest of the code with the roles of the values swapped

Today, yes; most modern instruction sets are pretty orthogonal, and you can use a value in any register symmetrically -- although even today, division instructions (if they exist!) are among the most likely to violate that expectation, along with instructions that work with the stack pointer. But in the XOR heyday, this was less true -- instruction sets were less orthogonal, and registers were more scarced. It's not unreasonable for an OS scheduler tick to do some work to figure out the newly-scheduled task's stack pointer in one register, and need to swap it into an SPR or similar so that the return from interrupt returns to the new location, for example; and this is the exact type of place where the XOR trick occasionally has value.

The "party trick" is more general than just xor, for example:

    void swap(unsigned* a, unsigned* b)
    {
        *a = *a + *b;
        *b = *a - *b;
        *a = *a - *b;
    }
GCC still sees through it with restrict: https://godbolt.org/z/8n3bGha3e
Yea this is operating on the pointer to the values, but if you use just an `unsigned a` it uses xor

clang: https://godbolt.org/z/5PEMTrxba

gcc: https://godbolt.org/z/7j8h4zo4v

This is covered in TFA.
This has overflow issues unlike xor.
Depending on what you think the "issue" is, one of:

* wrapping is well-defined behavior for unsigned integers; signed integer wrapping is UB, but is not used here.

* the equations (that a & b cancel each other out, resulting in the swap) hold even when done in a mod N ring.

You can even do it with complex images in a multi-color bitmap. I do it with my 6502 SBC setup: https://github.com/jimjag/JJ65c02/blob/main/Software/pico-co...
Another similar trick - XOR doubly linked list: XOR the prev and next pointers (storaged size of node decreases) and you can recover the values when accessing the node from prev or next side and thus already know one of the addresses.
This, too, is covered in TFA.
>The other classic use of XOR - cursor overdrawing - has also long since gone away.

But that's a specific example of a general use that still is handy occasionally. A = B XOR C. Come back later with C and A and retrieve B.

the "XOR in hash functions" mentioned at the end most certainly refers to Zobrist hashing, the actually useful XOR trick.

Requires fixed-length keys. Generate random table for each byte position. Then to hash a key, for each position lookup byte in table and XOR the results. This is used in chess/go/shogi/... engines because the board/position representation is fixed-length and you can undo modes easily - XOR-out byte from previous state and XOR-in from new state.

When I opened the article, I have immediately searched for "register renaming"...
> register renaming

is register renaming something done in hardware, or by the compiler?