Hacker News new | ask | show | jobs
by comex 2465 days ago
I don't entirely agree. As an analogy, suppose that you want to pass a pointer to an object as a function argument. As the programmer, you expect the object will necessarily remain allocated as long as the function uses it, since it's kept alive elsewhere, but you might have made a mistake. Before Rust, your options would be:

1. Use a C-style raw pointer. This has no overhead but is unsafe.

2. Use a reference counted or garbage collected pointer. This provides safety but is a non-zero-overhead abstraction. In other situations, reference counting can be necessary to manage unpredictable object lifetimes, but in this case we're assuming it would only be used to guard against mistakes, so the cost represents overhead. Even if some language implements reference counting just as fast as you can implement reference counting in C, that doesn't make it zero-overhead.

But Rust adds a new option:

3. Use a borrow-checked reference. This is safe, yet zero-overhead compared to the unsafe option, as long as your usage pattern can be expressed in terms of Rust lifetimes.

Going back to array indexing, the analogous situation is that you want to access an index that you, the programmer, expect to always be in bounds. Option 1 corresponds to unchecked indexing, and option 2 corresponds to bounds-checked indexing. But Rust brings nothing new in this area: there is no option 3.

Yet an option 3 is theoretically possible: safe unchecked indexing if you can prove to the compiler that the index can never be out of bounds. This can be done in languages with dependent types or built-in proof systems. (It can even be done in Rust with a certain neat hack [1], but with a lot of limitations.)

I'm not saying Rust is bad for not having dependent types. As far as I know, it's an open research problem how to make those features feel ergonomic and easy to use, even to the extent that Rust lifetimes are (i.e. not totally). And on the flipside, bounds checks usually don't cause very much overhead in practice, thanks to CPU branch prediction, plus the optimizer can sometimes remove them.

But I'd say that Rust's choice not to go there (...yet...) justifies calling its current approach a non-zero-overhead abstraction.

[1] https://github.com/bluss/indexing

1 comments

Bounds-checked indexing isn't an abstraction, it's a safety feature. The abstraction is compared to manually doing bounds checks like you would in C.

  // C safe subscript
  if (i < size) { return buf[i]; } else { abort(); }

  // Rust safe subscript
  return buf[i];
That's the abstraction, and Rust introduces no overhead here.
This is such a useless definition of "zero-cost" abstraction that almost everything qualifies for it.
It's literally the definition. And very little qualifies. An abstraction is zero-cost if there's no runtime penalty, including if the code you'd write by hand to accomplish this goal is no better than what the compiler synthesized for you via the abstraction.

If the use of the abstraction forces your data model into a suboptimal representation, it's not zero-cost. If the compiler emits code that's worse than the straightforward manual implementation, it's not zero-cost. If the emitted code involves runtime checks that aren't necessary without the abstraction, it's not zero-cost.

For example, reference-counting is an abstraction over tracking the lifetime of your object. In some cases (where ownership isn't clear) the retain count introduced by reference counting is required and therefore not a cost, but in most cases the point at which the object ends up freed is actually predictable, and therefore the cost of all the reference counting is something that would have been avoided without the abstraction. Therefore, reference-counting as a replacement for manual alloc/free is generally not zero-cost.

Or how about iteration. Rust has an external iterator model (where you construct an iterator and run that, rather than passing a callback to the collection). In most languages an external iterator model is a non-zero cost, because you need to construct the iterator object and work with that, which is a bit of overhead compared to what the collection could do with an internal iterator model. In Rust, external iterators are frequently zero-cost because they usually get inlined. So you can write a high-level loop that includes a filter and map and the end result is very frequently just as good as (if not better than) what you'd have done if you wrote the iteration/filter/map by hand without Rust's iterator abstraction.