Hacker News
by smj-edison 84 days ago
I feel like encapsulation and composition are in strong tension, and this is one place where it boils over.

I've written a decent bit of Rust, and am currently messing around with Zig. So the comparison is pretty fresh on my mind:

In Rust, you can have private fields. In Zig all fields are public. The consequences are pretty well shown with how they print structs: In Rust, you derive Debug, which is a macro that implements the Debug trait at the definition site. In Zig, the printing function uses reflection to enumerate the provided struct's fields, and creates a print string based on that. So Rust has the display logic at the definition site, while Zig has the logic at the call site.
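To make the Rust half of that concrete, here is a rough sketch. The `bank` module and `Account` type are made up for illustration. The point is that the derived Debug impl is generated inside the defining module, so it can see the private field even though callers can't.

```rust
mod bank {
    // Hypothetical type; `balance_cents` is private to this module.
    #[derive(Debug)]
    pub struct Account {
        pub owner: String,
        balance_cents: u64,
    }

    impl Account {
        pub fn new(owner: &str, balance_cents: u64) -> Self {
            Account { owner: owner.to_string(), balance_cents }
        }
    }
}

// Code out here cannot read `balance_cents` directly, but it can still print
// the whole struct, because the Debug impl was generated inside `bank`.
pub fn describe(account: &bank::Account) -> String {
    format!("{:?}", account)
}
```

Writing `account.balance_cents` inside `describe` would be a compile error, which is exactly why a call-site, reflection-style printer would not work for this type.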

It's similar with hash maps: in Rust you derive/implement the Hash and Eq traits (Eq builds on PartialEq) on the key type, while in Zig you provide the hash and eql functions at the call site.
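A minimal sketch of the Rust side, with a made-up `Point` key type and `count_visits` helper. The hashing and equality logic travels with the type, so the map itself never needs to be told how to compare keys.

```rust
use std::collections::HashMap;

// Hypothetical key type. Hash and equality are attached to the type itself,
// so every HashMap keyed by `Point` uses the same definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// Counts how many times each point appears in `path`.
pub fn count_visits(path: &[Point]) -> HashMap<Point, u32> {
    let mut visits = HashMap::new();
    for p in path {
        *visits.entry(*p).or_insert(0) += 1;
    }
    visits
}
```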

Each one has pretty stark downsides: Zig - since everything is public, you can't guarantee that your invariants are valid. Anyone can mess around with your internals. Rust - once a field is private (which is the convention), nobody else can mess with the internals. This means outside modules can't access internal state, so if the API is bad, you're pretty screwed.
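Both sides of that trade-off fit in one small sketch. The `range` module and `Bounds` type are hypothetical, and the invariant is simply `low <= high`.

```rust
mod range {
    // Hypothetical type. Invariant: `low <= high`.
    #[derive(Debug)]
    pub struct Bounds {
        low: i32,
        high: i32,
    }

    impl Bounds {
        // The only way to build a `Bounds` from outside this module,
        // so the invariant is checked exactly once.
        pub fn new(low: i32, high: i32) -> Option<Bounds> {
            if low <= high {
                Some(Bounds { low, high })
            } else {
                None
            }
        }

        // Relies on the invariant: `high` is never below `low` here.
        pub fn width(&self) -> u32 {
            self.high.abs_diff(self.low)
        }
    }
}
```

Outside `range`, `Bounds { low: 5, high: 1 }` does not compile, so the invariant holds. The flip side is that there is also no way to read `low` or `high`, because this API never exposed them.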

Honestly, I'm not sure if there is a way to resolve this tension.

EDIT: one more thought: Zig vs Rust also shows up in how object destruction is handled. In Rust you implement the Drop trait, so each type can only have one way to be destroyed. In Zig you use defer/errdefer, so you can choose what type of destructor runs, but this also means you can mess up destruction in subtle ways.
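For the Rust side, a small sketch of what "one way to be destroyed" looks like. `TempBuffer` and the shared log are invented for illustration; the log only exists so the drop order can be observed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical resource that records its own cleanup in a shared log.
pub struct TempBuffer {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for TempBuffer {
    // The single destructor for this type; callers cannot pick another one.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("freed {}", self.name));
    }
}

pub fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = TempBuffer { name: "a", log: Rc::clone(&log) };
        let _b = TempBuffer { name: "b", log: Rc::clone(&log) };
        // Both values are dropped at the end of this scope,
        // in reverse declaration order.
    }
    let entries = log.borrow().clone();
    entries
}
```

Here `run()` returns `["freed b", "freed a"]`. Nothing at the call site says how cleanup happens, which is the convenience and the limitation at the same time.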


> so if the API is bad, you're pretty screwed.

Is this really that big a downside? It encourages good APIs.

The alternative of everything being public is the kind of feature that quickly becomes a big disadvantage in larger systems and teams, where saying “just don’t footgun yourself” is not a viable strategy. If there’s a workaround to achieve some goal, people will use it, and you end up with an unmaintainable mess. It’s why languages whose names start with C feature so prominently on CVE lists.

There are always corner cases where you might need to do something differently. I had three memorable cases in my career:
1. Python 2.6.x had a stdlib bug where Windows event logging crashed the process when the user had some rights set differently. We submitted a fix, but in the meantime we simply overwrote the private function and could ship.
2. Also Python: scikit-learn had a primitive "print everything" strategy, but we needed to get its output into a logging framework. We overwrote their print wrapper and could ship.
3. In C#, a third-party lib insisted on dumping a result to a file. We used reflection to get it as a stream instead.

All three are not ideal - but I think having escape hatches is important. I also think private/public is overrated. Having it as a signal is ok. Forbidding access to privates is too strong.

Three cases in your career doesn't sound like a strong counterargument to me.

I agree that escape hatches can be a good idea, though. But they should be very controlled, e.g. requiring annotations in the code, something that can be reported on by automated tooling and that can't just be done inconspicuously.

Personally, I am comfortable with Python's "linter warning and we are all adults here" approach. It works well, and I have never seen anybody complain "I overwrote this private method and after an upgrade it did not work!". .NET allows it via reflection, and considering that .NET Framework could run untrusted code, it was okay that it was forbidden out of the box (since reflection was forbidden for untrusted code). But in the current world, where untrusted code does not really exist anymore? It's just legacy cruft.

The problem is that it only takes one bad or incomplete API in your specific use case. I ran into this a lot when I used cpal. For example, the data stream enum type (i16, u8, f32, etc.) didn't have Hash or Eq derived, so I had to create a wrapper type for it. But the type was marked non-exhaustive, so I wouldn't be able to tell if my wrapper got out of sync with theirs. It was a pain to work around.
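Roughly what that workaround looks like, as a sketch. The `vendor` module and `SampleKind` enum are stand-ins I made up, not cpal's actual definitions, and `KindKey` is a hypothetical wrapper name.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

mod vendor {
    // Stand-in for a third-party enum; NOT cpal's real definition.
    // It derives neither Hash nor Eq.
    #[non_exhaustive]
    #[derive(Debug, Clone, Copy)]
    pub enum SampleKind {
        I16,
        U8,
        F32,
    }
}

// Local newtype: std traits cannot be implemented directly on a foreign type.
#[derive(Debug, Clone, Copy)]
pub struct KindKey(pub vendor::SampleKind);

impl KindKey {
    // Inside one crate the catch-all arm is redundant, hence the allow.
    #[allow(unreachable_patterns)]
    fn tag(&self) -> u8 {
        match self.0 {
            vendor::SampleKind::I16 => 0,
            vendor::SampleKind::U8 => 1,
            vendor::SampleKind::F32 => 2,
            // Across crates, #[non_exhaustive] forces this arm, so a variant
            // added upstream lands here silently instead of failing to compile.
            _ => u8::MAX,
        }
    }
}

impl PartialEq for KindKey {
    fn eq(&self, other: &Self) -> bool {
        self.tag() == other.tag()
    }
}

impl Eq for KindKey {}

impl Hash for KindKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.tag().hash(state);
    }
}

// Example use: count distinct kinds by way of the wrapper.
pub fn distinct_kinds(kinds: &[vendor::SampleKind]) -> usize {
    kinds.iter().map(|k| KindKey(*k)).collect::<HashSet<_>>().len()
}
```

The catch-all arm is the out-of-sync problem: two different future variants would both map to `u8::MAX` and compare equal, and the compiler would not say anything.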

In other cases, I couldn't work around it, so I had to vendor some things. I ended up implementing my own graph library, because the existing one wouldn't let me reach in to change some invariants for undo/redo. Which I mean, fair enough if that's what's needed, but it's a real pain to reimplement something because the API was incomplete. And of course, now if something from another library needs petgraph, I'd have to convert my graph to its graph.

So yes, in theory, if we had great APIs this wouldn't be a problem. Unfortunately, APIs are always a work in progress, and sometimes need escape hatches in order to send values between libraries.

Because no one has ever deliberately used the wrong tool for business reasons. Or thought they had a perfectly reasonable argument.

It's better to have escape hatches in case you need them, but anyone who feels that way probably isn't using Rust to start with.

Maybe that's a bit harsh. I'm sure there are some problem domains where the other trait is desirable, but IMO it's not generic systems programming.

> Zig vs Rust also shows up with how object destruction is handled.

I often hear critiques of Drop being less efficient for anything Arena-like, where batch destruction would be better, and holding that as the reason defer is a better approach. What is not mentioned there is that there's nothing stopping you from having both. In Rust you can perform batch destruction by having additional logic (it might require control of the container and its contents' types for easiest implementation), but the default behavior remains sane.
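One way to read "having both", as a minimal sketch: an index-based arena where values live in a single Vec and are released together when the arena goes away. `Arena`, `Id`, and `sum_nodes` are hypothetical names, and real arena crates are more involved than this.

```rust
// Minimal index-based arena: values live in one Vec and are released together.
pub struct Arena<T> {
    items: Vec<T>,
}

// Handle into the arena; plain data, so it has no destructor of its own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Id(usize);

impl<T> Arena<T> {
    pub fn new() -> Self {
        Arena { items: Vec::new() }
    }

    pub fn alloc(&mut self, value: T) -> Id {
        self.items.push(value);
        Id(self.items.len() - 1)
    }

    pub fn get(&self, id: Id) -> &T {
        &self.items[id.0]
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

pub fn sum_nodes() -> u64 {
    let mut arena = Arena::new();
    let ids: Vec<Id> = (1..=4u64).map(|n| arena.alloc(n)).collect();
    let total: u64 = ids.iter().map(|id| *arena.get(*id)).sum();
    total
    // `arena` goes out of scope here. For element types without destructors
    // (like u64), that is a single deallocation of the backing Vec.
}
```

If the element type does implement Drop, the Vec still runs each destructor, so the batch win mostly applies to plain data or to arenas that deliberately skip per-item cleanup.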

That's fair, since you can leak the box. I will say though it's not as ergonomic as defer, as defer handles all exits from the scope, where it's trickier to juggle destructors. Though on further thought, I suppose the arena can have Drop.
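For comparison, the usual Rust approximation of defer is a small guard type whose Drop runs a closure. This is a sketch with a hypothetical `Defer` helper and `process` function, not something from the standard library.

```rust
use std::cell::Cell;

// Hypothetical helper: runs the closure when the guard goes out of scope.
pub struct Defer<F: FnMut()>(pub F);

impl<F: FnMut()> Drop for Defer<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

// Bumps `cleanups` on every exit path, early return included.
pub fn process(fail_early: bool, cleanups: &Cell<u32>) -> Result<u32, String> {
    let _guard = Defer(|| cleanups.set(cleanups.get() + 1));
    if fail_early {
        return Err("bailed out".to_string()); // guard still runs here
    }
    Ok(42)
}
```

It covers every exit from the scope, but it is clunkier than defer: the guard needs a name, and the closure's borrows stay live until the end of the scope.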

EDIT: What you can't really do is this: https://github.com/smj-edison/zicl/blob/ea8b75a1284e5bd5a309...

Here I'm able to swap out std.MultiArrayList's backing to be backed by virtual memory, and correctly clean it up. I'm not sure you can really do that with Rust, barring making custom data structures for everything.