by expression 3676 days ago
Haven't been following this blog, but one thing caught my eye while skimming the post:

>The implementations of these methods need to modify the correct bits of the u16 without touching the other bits. For example, we would need the following bit-fiddling to set the stack index:

    self.0 = (self.0 & 0xfff8) | stack_index;
>Or alternatively:

    self.0 = (self.0 & (!0b111)) | stack_index;
>Or:

    self.0 = ((self.0 >> 3) << 3) | stack_index;
>Well, none of these variants is really readable and it’s very easy to make mistakes somewhere. Therefore I created a BitField type with the following API:

    self.0.set_range(0..3, stack_index);
>I think it is much more readable, since we abstracted away all bit-masking details.

Personally, I disagree. Whilst it is easy to make mistakes when writing bit-fiddling code (there seems to be one in the second example, unless Rust has one of the weirder !-operators among programming languages), or any code for that matter, the meaning of all the bitwise operations is easy to recognise.

Without knowing the API, the transformed line rings no bells. How many APIs do you want to learn?

7 comments

Ranges are used in a lot of places in Rust (over sequential collections, for example). Bit operations are used to engage with low-level system APIs like this or for modular arithmetic, which is not common in Rust. As a Rust user, I absolutely know what a range is, and find the `set_range` function very intuitive. I have to drop into messing around with bitwise AND only rarely, and as someone who only wrote a small amount of C before Rust, it is always an exhausting effort.
And I find the bit manipulations easier to grok (then again, I know several different assembly languages and have been programming in C for years).

As for the bit ranges, it looks backwards to me, since bits are normally numbered high to low. I would find it easier to read if it read:

    self.0.set_range(3..0, stack_index);
I can "see" that much easier.
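One caveat for the high-to-low spelling, as a minimal sketch using only `std::ops::Range`: a Rust range whose start is not below its end is simply empty, so an API that accepted `3..0` would have to read the public `start` and `end` fields itself rather than iterate. Both helper names below are hypothetical.

```rust
use std::ops::Range;

/// Collect the bit positions a right-open range covers, lowest first.
/// (Hypothetical helper for illustration only.)
fn covered_bits(range: Range<u8>) -> Vec<u8> {
    range.collect()
}

/// One way an API could accept either spelling: normalise the endpoints
/// by reading the `start` and `end` fields directly.
fn normalised(range: Range<u8>) -> Range<u8> {
    if range.start <= range.end {
        range
    } else {
        range.end..range.start
    }
}
```

With these, `covered_bits(0..3)` yields bits 0, 1 and 2, while `covered_bits(3..0)` yields nothing until the range is passed through `normalised`.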
Well, Rust supports both bitwise manipulations and a range-based interface, so people from various backgrounds can use whichever they find more natural. :-)
Agreed, definitely should go high to low.
Yeah, it depends on personal taste. However, Ranges [1][2] are a core type of Rust and used pretty frequently, so it shouldn't be completely alien to Rust users…

[1]: https://doc.rust-lang.org/book/iterators.html
[2]: https://doc.rust-lang.org/nightly/std/ops/struct.Range.html

My feeling is that the clearest way to do this would be:

    self.0.stack_index = stack_index;
using an appropriate packed struct/union type such that this makes sense.

This best represents what you're actually doing, instead of faffing around with bit twiddling to achieve the same result.

Does Rust not allow something like this?

This is indeed a Rust failing. There are no untagged unions.

There should be. Obviously, unions with pointers are special. Some code might need to be marked unsafe. I suspect there is a subset of pointer-in-union operations that could be made safe, possibly either changing or using the pointer but not both.

Another thing missing is bitfields. I'd love to see this done right, with adjustable packing so that one can use them to write an emulator or file parser.

Bitfields are in a crate, untagged unions are in the RFC process.
Crates and cargo look like trouble:

I'm not going to develop while connected to the internet. This is not allowed by any employer that cares about security.

I prefer to use the package manager that comes with my OS. Every other installer (firefox, nvidia crap, etc.) risks screwing things up.

Given the two things above, I hope you can see how much of a pain it would be to run some non-standard installer. I'd need to ask IT to mirror the rust cargo stuff... and is this even possible? Would I have to dig deep into the rust stuff to change a public key?

With it being a crate, I worry about it going unsupported. I've seen this all too often with firefox extensions. If not that, maybe somebody will come along and change the API.

I couldn't even find the crate. I found one called bitflags that does 1-bit fields. That isn't suitable. I'm looking for something that can handle stuff like an x86-64 descriptor table (GDT, LDT, IDT) entry or a PowerPC opcode. The PowerPC mtspr and bl instructions are interesting examples: mtspr has a 10-bit split bitfield, with a pair of 5-bit halves in the wrong order, and bl has a field that kind of has the low 2 bits stolen by other fields. The x86-64 descriptors have lots of strange-sized split fields and they are tagged unions for which the tag layout is hardware-defined. Another good one is page table entries, from the perspective of: emulator, hypervisor, OS.
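The split-field case can still be expressed with plain shifts. This is a minimal sketch that assumes only what is stated above (a 10-bit field stored as two 5-bit halves in swapped order). The function name is hypothetical, and the position of the field inside the instruction word is deliberately left out.

```rust
/// Swap the two 5-bit halves of a 10-bit field.
/// Applying the function twice gives back the original value.
fn swap_5bit_halves(field: u16) -> u16 {
    debug_assert!(field < (1 << 10), "field must fit in 10 bits");
    let low = field & 0x1f; // bits 0..5
    let high = (field >> 5) & 0x1f; // bits 5..10
    (low << 5) | high
}
```

Because the swap is its own inverse, the same function could serve both an encoder and a decoder.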

With this not being part of the language proper, I have to wonder how well it performs. Has anybody compared the resulting assembly code?

  > I'm not going to develop while connected to the internet.
You don't have to. If you do use crates from crates.io, then you need to be connected the first time, to get everything. Not after that.

  > ... and is this even possible?
Running your own registry with your own universe of crates is very possible. Mirroring the external world inside that one is not that easy.

  > I couldn't even find the crate. ... that isn't suitable.
Well, multi-bit fields are made up of one-bit fields... if you're looking for GDT, LDT, IDT stuff, the crate referenced in the article already has all of that defined for you. If you want to define it yourself, well, I did it with a struct and a method that handles the bit shifting.
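A minimal sketch of that "struct and a method" approach, using the 3-bit stack index from the quoted article as the example field. The type and method names here are hypothetical and are not the API of the crate mentioned above.

```rust
/// Newtype over the raw 16-bit value; the low 3 bits hold the stack index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct EntryOptions(u16);

impl EntryOptions {
    const STACK_INDEX_MASK: u16 = 0b111;

    /// Read the stack index (bits 0..3).
    fn stack_index(self) -> u16 {
        self.0 & Self::STACK_INDEX_MASK
    }

    /// Write the stack index without touching the other bits.
    /// Values wider than 3 bits are truncated.
    fn set_stack_index(&mut self, index: u16) {
        self.0 = (self.0 & !Self::STACK_INDEX_MASK) | (index & Self::STACK_INDEX_MASK);
    }
}
```

Callers then write `options.set_stack_index(5)` and never see the mask, which is close to the field-assignment syntax asked for earlier in the thread.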

  > I have to wonder how well it performs.
Well, Rust is itself implemented in Rust, the standard library is all in Rust, so.

Also, finally, you don't have to use crates.io to use Cargo. You can just specify crates that are on your local filesystem, or a git repository hosted anywhere.

I was just about to post this exact same thing. Now there's an introduction of ranges and another API that is necessary to understand in order to grok this code. Bit twiddling, ANDing, and ORing are specific and absolute.

For example, what happens when I specify a range of 0..100 for a 32-bit int? What if the value is too big for the range? APIs around small operations like this introduce ambiguity that wasn't present before.

>Bit twiddling, ANDing, and ORing are specific and absolute

So what? You could implement addition using bitwise operators, but it makes for much more readable and less error-prone code to abstract it into an addition operator. Code should clearly express intent, and if your intent is to set, say, bits 5 through 9 of some register to some value, the clearest expression of that is something like (given a right-open range convention, which is well-established in this particular language):

    register.set_bits(5..10, value);
If I look at the above statement, I know what the author of the code intended it to do and can rely on (or manually verify once) the correctness of the implementation of "set_bits." If instead I'm reading code like

    register = (register & 0xFC1F) | (value << 5);
I have to work out in my head what the bit fiddling is doing, and from that guess what the author meant the code to do. I can't know whether he or she made a mistake in translating the intent to bitwise operations (say I know from context that value is less than 8; did they really mean to clear bits 8 and 9?), because the intent isn't expressed anywhere.
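To make the comparison concrete, here is a minimal sketch of what a `set_bits` helper could look like for a `u16`, assuming the right-open range convention. It is written as a free function for brevity and is not the implementation from the article's crate; it only shows that the range form reduces to the same mask-and-shift.

```rust
use std::ops::Range;

/// Mask with ones in the bit positions `range.start..range.end`.
/// Assumes `range.start < range.end <= 16`.
fn field_mask(range: &Range<u32>) -> u16 {
    let width = range.end - range.start;
    // Build `width` low ones in a wider type so that width == 16 does not overflow.
    let ones = ((1u32 << width) - 1) as u16;
    ones << range.start
}

/// Replace the bits in `range` with `value`, leaving all other bits unchanged.
/// Bits of `value` that do not fit in the range are dropped.
fn set_bits(register: u16, range: Range<u32>, value: u16) -> u16 {
    let mask = field_mask(&range);
    (register & !mask) | ((value << range.start) & mask)
}
```

For the range `5..10` the inverted mask works out to `0xFC1F`, the same constant as in the hand-written line, so the two forms agree whenever `value` fits in five bits.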

>For example, what happens when I specify a range of 0..100 for a 32 bit int?

What happens is that you've violated a precondition of the API. Because this is Rust, I would expect the operation to be implemented in such a way as to panic (in debug mode), or just fail to compile, since selecting a bit range like this at run time is extremely uncommon.

>what if the value is too big for the range?

Same thing as if you weren't using the abstraction around the bitwise operators: either you'd clobber higher bits, or the implementation would truncate the value.

Again, though, I would expect a debug-mode panic.
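A sketch of how those two preconditions might be checked, assuming a 32-bit register. The function names are hypothetical; `debug_assert!` is the standard macro that is compiled out when debug assertions are disabled, which is the default for release builds.

```rust
use std::ops::Range;

/// Check the two preconditions for a 32-bit register: the range must lie
/// within the register, and the value must fit in the range.
fn check_args(range: &Range<u32>, value: u32) -> Result<(), &'static str> {
    if range.start >= range.end || range.end > u32::BITS {
        return Err("bit range is empty or exceeds the register width");
    }
    let width = range.end - range.start;
    // A full-width field accepts every value; otherwise compare against 2^width.
    if width < u32::BITS && value >> width != 0 {
        return Err("value does not fit in the bit range");
    }
    Ok(())
}

/// Set bits `range` of `register` to `value`.
/// Misuse panics in debug builds; this sketch makes no promise about
/// the result of misuse when debug assertions are off.
fn set_bits_checked(register: u32, range: Range<u32>, value: u32) -> u32 {
    debug_assert!(check_args(&range, value).is_ok(), "set_bits precondition violated");
    let width = range.end - range.start;
    let mask = (u32::MAX >> (u32::BITS - width)) << range.start;
    (register & !mask) | ((value << range.start) & mask)
}
```

Keeping the check in its own function makes the policy easy to change, for example to return the `Result` to the caller instead of asserting.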

>APIs around small operations like this introduce ambiguity that wasn't present before.

I'm not sure what "ambiguity" you see here. There's the issue of whether the range is right-open or -closed, but again right-open ranges are well-established in Rust.

A range of 0..100 could be a static error. There's nothing fundamentally ambiguous about the technique; e.g., you can shift by 100 already.
Every other project comes up with its own wrapper for bitwise operations. It's not that big a piece of code to find and understand. It's a matter of taste. The first one is the canonical form (in C) and the only error source is the hex.
! is both bitwise and boolean negation, depending on the type (bool vs some numeric)
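A quick check of that point, as a sketch: on an integer type `!` flips every bit, so the three variants quoted at the top compute the same result. The function names are hypothetical, and like the quoted lines they do not mask `stack_index`, so it is assumed to be below 8.

```rust
/// Variant 1: explicit hex mask.
fn set_with_hex_mask(raw: u16, stack_index: u16) -> u16 {
    (raw & 0xfff8) | stack_index
}

/// Variant 2: `!` on an integer is bitwise NOT, so `!0b111u16` equals `0xfff8`.
fn set_with_not_mask(raw: u16, stack_index: u16) -> u16 {
    (raw & !0b111) | stack_index
}

/// Variant 3: shift the low three bits out and back in as zeros.
fn set_with_shifts(raw: u16, stack_index: u16) -> u16 {
    ((raw >> 3) << 3) | stack_index
}
```

On a `bool` the same operator is logical negation, which is why the second variant is not a bug in Rust even though the equivalent C expression would be.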
>Personally, I disagree. Whilst it is easy to make mistakes while writing bit-fiddling code

Luckily Rust gives you the #[test] attribute, so on the next line after abstracting away the bit-flipping details you can prove it works correctly.
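For instance, a unit test for a hypothetical `set_stack_index` helper might look like the sketch below. Functions marked `#[test]` are compiled and run by `cargo test`; the helper and the test name are made up for illustration.

```rust
/// Hypothetical helper: store a 3-bit stack index in the low bits of `raw`.
fn set_stack_index(raw: u16, stack_index: u16) -> u16 {
    (raw & !0b111) | (stack_index & 0b111)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn stack_index_leaves_other_bits_untouched() {
        assert_eq!(set_stack_index(0xffff, 0), 0xfff8);
        assert_eq!(set_stack_index(0xfff8, 5), 0xfffd);
        assert_eq!(set_stack_index(0x0000, 7), 0x0007);
    }
}
```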