"Hacker's Delight" book was really useful for me even though I already had a good grasp on bitwise operators. I sat down with a pencil and went through the bit matrix transpose example and it really opened my eyes to how much interesting stuff can be done this way. I think the key to understanding is to visualize the bitwise operations.
Most of it is pattern recognition and learning common idioms, like multiplying by 0x01010101..., extracting substrings of bits ((x >> y) & z), thinking of values as sets of integers and so on.
Also bitwise manipulations lend themselves well to function composition, so try to keep an eye out for that. For example, if you have a fast function that checks if any byte is zero, you can generalize it to any value by applying xor beforehand. Similarly, the fast bit matrix transpose (which absolutely terrified me) is simply applying the least number of "shift a subset of bits" operations such that each bit travels the required distance.
Honestly just practice. They cost me an interview I really wanted once, so now anytime I stumble across one in a project I take extra time to read and understand it. I also make a point of using bitmasks as function arguments in places where that's a good/reasonable choice.
Start counting in binary on your fingers until you're just about "binary native". Also, perform operations, and think about why they behave the way they do.
It sounds a little woo-woo but going from representing something on paper (or worse, a display) to doing it with my hands really changed it for me. Humans likely evolved intelligence because we stood up and used our hands to manipulate our environment, I suspect we more fully engage our brains in important ways by doing something with our hands when we can.
You might start appreciating the imperial system of measurement though. You've been warned.
Think about all tricks you can so in decimal, but in base 2 instead.
E.g in decimal you can multiply by ten by appending a zero, so in binary you multiply by 2 by doing so. And left shifting by 1 is what appends a zero.
Or you can take approximate log base ten in decimal by counting amount of digits, so in binary you can approximate log2 this way (counting up to most significant one-bit). Etc...
Most of it is pattern recognition and learning common idioms, like multiplying by 0x01010101..., extracting substrings of bits ((x >> y) & z), thinking of values as sets of integers and so on.
Also bitwise manipulations lend themselves well to function composition, so try to keep an eye out for that. For example, if you have a fast function that checks if any byte is zero, you can generalize it to any value by applying xor beforehand. Similarly, the fast bit matrix transpose (which absolutely terrified me) is simply applying the least number of "shift a subset of bits" operations such that each bit travels the required distance.