by haberman 384 days ago
> I feel like I can write powerful code in any language, but the goal is to write code for a framework that is most future proof, so that you can maintain modular stuff for decades.

I like Zig a lot, but long-term maintainability and modularity are among its weakest points IMHO.

Zig is hostile to encapsulation. You cannot make struct members private: https://github.com/ziglang/zig/issues/9909#issuecomment-9426...

Key quote:

> The idea of private fields and getter/setter methods was popularized by Java, but it is an anti-pattern. Fields are there; they exist. They are the data that underpins any abstraction. My recommendation is to name fields carefully and leave them as part of the public API, carefully documenting what they do.

You cannot reasonably form API contracts (which are the foundation of software modularity) unless you can hide the internal representation. You need to be able to change the internal representation without breaking users.

Zig's position is that there should be no such thing as internal representation; you should publicly expose, document, and guarantee the behavior of your representation to all users.

I hope Zig reverses this decision someday and supports private fields.

16 comments

I disagree with plenty of Andrew's takes as well but I'm with him on private fields. I've never once in 10 years had an issue with a public field that should have been private, however I have had to hack/reimplement entire data structures because some library author thought that no user should touch some private field.

> You cannot reasonably form API contracts (which are the foundation of software modularity) unless you can hide the internal representation. You need to be able to change the internal representation without breaking users.

You never need to hide internal representations to form an "API contract". That doesn't even make sense. If you need to be able to change the internal representation without breaking user code, you're looking for opaque pointers, which have been the solution to this problem since at least C89, I assume earlier.

If you change your data structures or the procedures that operate on them, you're almost certain to break someone's code somewhere, regardless of whether or not you hide the implementation.

Most data structures have invariants that must hold for the data structure to behave correctly. If users can directly read and write members, there's no way for the public APIs to guarantee that they will uphold their documented API behaviors.

Take something as simple as a vector (e.g. std::vector in C++). If a user directly sets the size or capacity, calls to methods like push_back() will behave incorrectly, or may even crash.
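A minimal C++ sketch of that failure mode, assuming a hypothetical `IntVec` type with every field public (it is not std::vector, whose members are private):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical growable array with all fields public.
struct IntVec {
    std::vector<int> storage;  // backing buffer; storage.size() acts as the capacity
    std::size_t size = 0;      // number of live elements; must stay <= storage.size()

    void push_back(int v) {
        if (size == storage.size())
            storage.resize(storage.empty() ? 4 : storage.size() * 2);
        storage[size++] = v;   // only safe while the invariant holds
    }

    bool invariant_holds() const { return size <= storage.size(); }
};

// Returns true if a direct write to `size` leaves the object in a broken state.
inline bool tampering_breaks_invariant() {
    IntVec v;
    v.push_back(1);
    assert(v.invariant_holds());
    v.size = 100;              // legal, because the field is public
    return !v.invariant_holds();
}
```

After the direct write, the next push_back() would index past the end of the buffer, which is the kind of misbehavior or crash described above.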

Opaque pointers are one way of hiding representation, but they also eliminate the possibility of inlining, unless LTO is in use. If you have members that need to be accessible in inline functions, it's impossible to use opaque pointers.
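For reference, a rough sketch of the opaque pointer pattern in C-style C++. The `Counter` type and the file names in the comments are hypothetical, and both halves are shown in one listing only for brevity:

```cpp
#include <cassert>

// --- Public header (hypothetical counter.h): users see only this part ---
struct Counter;                      // incomplete type; its fields are invisible
Counter* counter_create();
void counter_increment(Counter* c);
int counter_value(const Counter* c);
void counter_destroy(Counter* c);

// --- Private source file (hypothetical counter.cpp) ---
struct Counter {
    int value;  // free to change: no user code can name this field
};

Counter* counter_create() { return new Counter{0}; }

void counter_increment(Counter* c) {
    assert(c != nullptr);
    ++c->value;
}

int counter_value(const Counter* c) {
    assert(c != nullptr);
    return c->value;
}

void counter_destroy(Counter* c) { delete c; }
```

Because counter_value() is defined in the source file, a caller in another translation unit cannot inline it without LTO, which is the trade-off raised above.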

There is certainly a risk of "implicit interfaces" (Hyrum's Law), where users break even when you're changing the internals, but we can lessen the risk by encapsulating data structures as much as possible. There are other strategies for lessening this risk, like randomizing unspecified behaviors, so that people cannot take dependencies on behaviors that are not guaranteed.
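One way the randomization strategy could look, as a sketch only. `TagSet` is a hypothetical container whose documented contract says nothing about iteration order, so it shuffles the order on every call:

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical set-like container. The order returned by items() is
// documented as unspecified, so it is deliberately shuffled each time.
class TagSet {
public:
    void add(int tag) {
        if (!contains(tag)) tags_.push_back(tag);
    }

    bool contains(int tag) const {
        return std::find(tags_.begin(), tags_.end(), tag) != tags_.end();
    }

    std::vector<int> items() const {
        std::vector<int> out = tags_;
        static std::mt19937 rng(std::random_device{}());
        std::shuffle(out.begin(), out.end(), rng);  // callers cannot rely on insertion order
        assert(out.size() == tags_.size());
        return out;
    }

private:
    std::vector<int> tags_;
};
```

Code that accidentally depends on insertion order tends to fail quickly in testing instead of years later when the internals change.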

> Most data structures have invariants that must hold for the data structure to behave correctly. If users can directly read and write members, there's no way for the public APIs to guarantee that they will uphold their documented API behaviors.

You can, just not in the "strictly technical" sense. You add a "warranty void if these fields are touched" documentation string.

That's honestly horrible. It's like finding your job is guaranteed by a pinkie promise, or the equivalent.

Most of the world runs on a handshake.

That's not a valid argument. For most of human existence there was cannibalism and/or human sacrifice. This doesn't mean we should go back to it.

Isn't that the norm in many places on earth?

I prefer liability when devs misuse software with consequences for society's infrastructure.

A language adding private fields does not add liability.

Indeed, misusing the library and causing software faults does, so every stone in the way preventing misuse helps.
> I've never once in 10 years had an issue with a public field that should have been private, however I have had to hack/reimplement entire data structures because some library author thought that no user should touch some private field.

Very similar experience here. Also just recently I really _had_ to use and extend the "internal" part of a legacy library. So potentially days or more than a week of work turned into a couple of hours.

Like unclad, I disagree that not having private fields is a problem. I think this comes down to programming style. For an OOP style (just one example), I can see how that would be irritating. Here's my anecdote:

I write a lot of Rust. By default, fields are private. It's rare to see a field in my code that omits the `pub` prefix. I sometimes start with private because I forget `pub`, but inevitably I need to make it public!

I like that they're there in principle, but in practice `pub` feels like syntactic clutter, because it's on all my fields! I think this is because I use structs as abstract bags of data, rather than patterns with getters/setters.

When using libraries that rely on private fields, I sometimes have to fork them so I can get at the data. If they do provide a way, it makes the (auto-generated) docs less usable than if the fields were public.

I suspect this might come down to the perspective of application/firmware development versus lib development. The few times I do use private fields have been in libs. E.g. if you have a matrix you generate from pub fields and similar.

One of the key principles of modular software is encapsulation. It predates OOP by decades, and even C got that right.
This is only a problem if you can't modify the library you're using for whatever reason (usually a bad one). If you have the source of all your dependencies, you can just fork and add methods as needed in the rare cases where you need to do this.
Some years ago I started to just not care about setting things to "private" (in any language). And I care _a lot_ about long term maintainability and breakage. I haven't regretted it since.

> You cannot reasonably form API contracts (...) unless you can hide the internal representation.

Yes you can, by communicating the intended use through comments/docstrings, examples, etc.

One thing I learned from the Clojure world, is to have a separate namespace/package or just section of code, that represents an API that is well documented, nice to use and more importantly stable. That's really all that is needed.

(Also, there are cases where you actually need to use a thing in a way that was not intended. That obviously comes with risk, but when you need it, you're _extremely_ glad that you can.)

I have the opposite experience. Several years ago I didn't worry too much about people using private variables.

Then I noticed people were using them, preventing me from making important changes. So I created a pseudo-"private" facility using macros, where people had to write FOOLIB_PRIVATE(var) to get at the internal var.

Then I noticed (I kid you not) people started writing FOOLIB_PRIVATE(var) in their own code. Completely circumventing my attempt to hide these internal members. And I can't entirely blame them, they were trying to get something done, and they felt it was the fastest way to do it.

After this experience, I consider it an absolute requirement to have a real "private" struct member facility in a language.
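To make the problem concrete, here is a sketch of what such a macro scheme might look like. The macro body and the `Buffer` type are assumptions for illustration; the real FOOLIB_PRIVATE definition is not shown in this thread:

```cpp
#include <cassert>

// Hypothetical definition: mangle the field name so it looks off-limits.
#define FOOLIB_PRIVATE(name) foolib_private_##name

struct Buffer {
    int FOOLIB_PRIVATE(len) = 0;  // "private" only by naming convention

    void append() { ++FOOLIB_PRIVATE(len); }

    int length() const {
        assert(FOOLIB_PRIVATE(len) >= 0);
        return FOOLIB_PRIVATE(len);
    }
};

// Client code: the same macro works just as well outside the library.
inline void client_shortcut(Buffer& b) { b.FOOLIB_PRIVATE(len) = 42; }
```

The compiler treats the mangled name like any other public field, so nothing stops a caller from using the macro too.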

I respect Andrew and I think he's done a hell of a job with Zig. I also understand the concern with the Java precedent and lots of wordy getters/setters around trivial variables. But I feel like Rust (and even C++) is a great counterexample that private struct variables can be done in a reasonable way. Most of the time there's no need to have getters/setters for every individual struct member.

It's about the contract with the users. I don't think you should worry about breaking someone using the private fields of your classes. Making a field private, for example by prefixing an underscore in Python, tells the users "for future maintainability of the software I allow myself the right to change this field without warning, use at your own peril".

If you hesitate to change it because you worry about users using it anyway, you are hurting the fraction of your users who are not using it.

This is company code in a monorepo. If a change breaks users, it will simply be rolled back.

Everyone is brainstorming ways to work around Zig's lack of "private". But nobody has a good answer for why Zig can't just add "private" to the language. If we agree that users shouldn't touch the private variables, why not just have the language enforce it?

> If we agree that users shouldn't touch the private variables, why not just have the language enforce it?

Thing is, I don't have an opinion about what users should do. That's entirely up to them and the trade offs they make in their contexts. There are scenarios where you might want to access a private field.

But it's also a question of simplicity: adding private to the language makes it bigger without imo contributing anything of practical value that can't be achieved with convention.

Because sometimes the user really wants to access those fields, and if the language enforces them being private, the user will either copy-paste your code into their project, or fork your project and make the fields public there. And now they have a lot more work to stay up to date than if the fields had been public and they had simply adapted whenever those fields changed.
I would be satisfied if the language supported this use case by offering a “void my warranty” annotation that let a given source file access the privates of a given import.

Companies with monorepos could easily just ban the annotation. OSS projects could easily close any user complaint if the repro requires the annotation.

This seems like a great compromise to me. It would let you unambiguously mark which parts of the api are private, in a machine checkable way, which is undoubtedly better than putting it into comments. But it would offer an escape hatch for people who don’t mind voiding their warranty.
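No such annotation exists in Zig or C++ as far as this thread says, but the idea can be roughly emulated in C++ with a single friend type. A sketch, where `Cache` and `VoidWarranty` are made-up names:

```cpp
#include <cassert>

class Cache {
public:
    void put(int v) {
        value_ = v;
        valid_ = true;
    }

    int get() const {
        assert(valid_);  // reading before put() is a caller bug
        return value_;
    }

private:
    friend struct VoidWarranty;  // the single, greppable escape hatch
    int value_ = 0;
    bool valid_ = false;
};

// Emulation only: any use of this name marks code that has left the
// supported API, so a monorepo lint rule or a maintainer can search for it.
struct VoidWarranty {
    static int& value(Cache& c) { return c.value_; }
};
```

Unlike the proposed annotation, the library author has to write the accessor up front, so this is a weaker form of the compromise.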

> or fork your project

If they want to ignore the API contract then that's the right response. The maintainer chose one thing to preserve their ability to provide non-breaking updates. The user doesn't care about that, now it's on them to maintain that code which they're sinking their probes into.

That is the beauty of binary libraries, they enforce encapsulation.
I started using Boost's approach, that is keep those things public but in their own clearly-named internal namespace (be it an actual namespace or otherwise).

This way users can get to them if they really need to, say for a crucial bug fix, but they're also clearly an implementation detail so you're free to change it without users getting surprised when things break etc.
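A small sketch of that layout, assuming a hypothetical `textlib` library. Boost conventionally names the inner namespace `detail`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace textlib {

// Reachable by anyone, but the name signals "implementation detail,
// may change without notice".
namespace detail {
inline std::string trim_left(const std::string& s) {
    std::size_t i = s.find_first_not_of(' ');
    return i == std::string::npos ? std::string() : s.substr(i);
}
}  // namespace detail

// Supported, documented API.
inline std::string normalize(const std::string& s) {
    std::string out = detail::trim_left(s);
    assert(out.empty() || out.front() != ' ');
    return out;
}

}  // namespace textlib
```

A user who really needs textlib::detail::trim_left() can call it, and a grep for `detail::` shows exactly where they took on that risk.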

> And I can't entirely blame them

You can't blame them, but they can't blame you if you break their code.

> Then I noticed (I kid you not) people started writing FOOLIB_PRIVATE(var) in their own code.

If it’s in an internal monorepo, this should be super easy to fix using grep.

Honestly it sounds like a great opportunity to improve your API. If people are going out of their way to access something that you consider private, it’s probably because your public APIs aren’t covering some use case that people care about. That or you need better documentation. Sometimes even a comment helps:

    int _foo; // private. See getFoo() to read.
I get that it’s annoying, but finding and fixing internal code like this should be a 15 minute job.
That's pretty much why I never bother with the underscore prefix convention when using python. If someone wants to use it they'll do it anyway.
C++ precedent though, getters and setters were widely adopted in C++ frameworks before Java was even an idea.
> After this experience, I consider it an absolute requirement to have a real "private" struct member facility in a language.

I think that's the wrong take to have. Life is much easier when you accept the reality of a world where people will do whatever they want with what you give them.

C++ has private, and so what? I've seen #define private public or even -Dprivate=public, I've seen classes with private implementation detail reimplemented with another name and all fields public & then cast, I've seen accessing types as char arrays and binary operations to circumvent this, I've seen accessing the process raw memory pages. If someone other than you can call the code, it's not yours anymore to decide what can be done with it.

What you don't owe anyone is the guarantee of things working if people stray from the happy path you outline - they want help after going astray, give them your hourly rate on fixing their mistakes.

> The idea of private fields and getter/setter methods was popularized by Java, but it is an anti-pattern.

I agree with this part with no reservations. The idea that getters/setters provide any sort of abstraction or encapsulation at all is sheer nonsense, and is at the root of many of the absurdities you see in Java.

The issue, of course, is that Zig throws out the baby with the bath water. If I want, say, my linked list to have an O(1) length operation, I need to maintain a length field, but the invariant that list.length actually lines up with the length of the list is something that all of the other operations need to maintain. Having that field be writable from the outside is just begging for mistakes. All it takes is list.length = 0 instead of list.length == 0 to screw things up badly.
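For comparison, a sketch of the same list in a language with private members (C++ here, with a hypothetical `IntList` type). The length is readable in O(1) but only the list's own operations can write it:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

class IntList {
public:
    void push_front(int v) {
        auto n = std::make_unique<Node>();
        n->value = v;
        n->next = std::move(head_);
        head_ = std::move(n);
        ++length_;
    }

    bool pop_front() {
        if (!head_) return false;
        assert(length_ > 0);
        head_ = std::move(head_->next);
        --length_;
        return true;
    }

    // O(1) and read-only: returns a copy, so callers cannot assign through it.
    std::size_t length() const { return length_; }

private:
    struct Node {
        int value;
        std::unique_ptr<Node> next;
    };
    std::unique_ptr<Node> head_;
    std::size_t length_ = 0;  // only push_front/pop_front touch this
};
```

With this shape, the typo `list.length() = 0` is a compile error rather than silent corruption.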

You can have a debug time check.
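Such a check might look like the following sketch, where `CountedList` is a hypothetical type with public fields and the assert is compiled out when NDEBUG is defined:

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>

struct CountedList {
    std::forward_list<int> nodes;  // std::forward_list has no size() member
    std::size_t length = 0;        // public cached length

    // O(n) walk that recomputes the real length.
    bool length_matches() const {
        return length ==
               static_cast<std::size_t>(std::distance(nodes.begin(), nodes.end()));
    }

    void push_front(int v) {
        assert(length_matches());  // debug builds only
        nodes.push_front(v);
        ++length;
    }
};
```

This catches a stray `list.length = 0` at the next API call in debug builds, but only at run time and only on code paths that tests actually exercise.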
Just prefix internal fields with underscore and be a big boy and don't access them from the outside.

If you really need to you can always use opaque pointers for the REALLY critical public APIs.

I am not the only user of my API, and I cannot control what users do.

My experience is that users who are trying to get work done will bypass every speed bump you put in the way and just access your internals directly.

If you "just" rely on them not to do that, then your internals will effectively be frozen forever.

Or you change it and respond with “You were warned”.

I seriously do not get this take. People use reflection and all kinds of hacks to get at internals, this should not stop you from changing said internals.

There will always be users who do the wrong thing.

Let's say I'm in a large company. Someone on some other team decided to rely on my implementation internals for a key revenue driver, and snuck it through code review.

I can't break their app without them complaining to my boss's boss's boss who will take their side because their app creates money for the company.

Having actual private fields doesn't 100% prevent this scenario, but it makes it less likely to sneak through code review before it becomes business-critical.

You can still create modules in Zig, just use the standard handle pattern as you might in C/C++. I think that many of us have worked in “large company”, and the issue you describe is not resolved with the “private” keyword. You need to make your “component/module” with a well defined boundary (normally dll/library), a “public interface” and the internals not visible as symbols.

That doesn’t save you in languages that support reflection, but it will with Zig. Inside a module, all private does is declare intent.

In languages with code inheritance, I think inheritance across module boundaries is now widely viewed as the anti-pattern that it is.

> If you "just" rely on them not to do that, then your internals will effectively be frozen forever.

Or they will be broken when you change them and they upgrade. The JavaScript ecosystem uses this convention and generally if a field is prefixed by an underscore and/or documented as being non-public then you can expect to break in future versions (and this happens frequently in practice).

Not necessarily saying that's better, but it is another choice that's available.

Not everyone has to follow the MS approach of not breaking clients that rely on “undocumented” behavior. Document what will not be broken in future, change the rest and ignore the wailing.

It’s antithetical to what Zig is all about to hide the implementation. The whole idea is you can read the entire program without having to jump through abstractions 10 layers deep.

> You cannot reasonably form API contracts (which are the foundation of software modularity) unless you can hide the internal representation

Python is a good counter example IMHO, the simple convention of having private fields prefixed with _/__ is enough of a deterrent, you don't need language support.

> You need to be able to change the internal representation without breaking users.

Unless the user only links an opaque pointer, then just changing the sizeof() is breaking, even if the fields in question are hidden. A simple doc comment indicating that "fields starting with _ are not guaranteed to be minor-version-stable" or somesuch is a perfectly "reasonable" API.
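A tiny C++ illustration of the sizeof() point, using hypothetical `PointV1`/`PointV2` structs to stand in for two releases of the same library type:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical library struct, release 1 and release 2.
struct PointV1 { int x; int y; };
struct PointV2 { int x; int y; int _cache; };  // adds an underscore-prefixed field

// Client code that stores the struct by value bakes its size into its own layout.
template <typename Point>
struct ClientRecord {
    Point p;
    int tag;
};

// True if upgrading from V1 to V2 changes the size of the client's record.
inline bool upgrade_changes_client_layout() {
    assert(sizeof(PointV2) > sizeof(PointV1));  // the new field takes space
    return sizeof(ClientRecord<PointV1>) != sizeof(ClientRecord<PointV2>);
}
```

A client binary built against release 1 and linked with release 2 would disagree with the library about where things live, even though it never touched `_cache`.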

I'd imagine semantic versioning to be more subjective with a language that relies on a social contract, because if a user chooses to use those private fields, a minor update or patch could break their code.

It does feel regressive to me. I've seen people easily reach for underscored fields in Python. We can discourage them if the code is reviewed, but then again there's also people who write everything without underscores.

The chance of someone relying on the size at an API level is extremely small. That's far less risky than exposing every field.
> Zig is hostile to encapsulation. You cannot make struct members private

In Zig (and plenty of other non-OOP languages) modules are the mechanism for encapsulation, not structs. E.g. don't make the public/private boundary inside a struct, that's a silly thing anyway if you think about it - why would one ever hand out data to a module user which is not public - just to tunnel it back into that same module later?

Instead keep your private data and code inside a module by not declaring it public, or alternatively: don't try to carry over bad ideas from C++/Java, sometimes it's better to unlearn things ;)

Why would you hand out data that gets tunneled back in?

There are lots of use cases for this exact pattern. An acceleration structure to speed up searching complex geometry. The internal state of a streaming parser. A lazy cache of an expensive property that has a convenient accessor. An unsafe pointer that the struct provides consistent, threadsafe access patterns for. I've used this pattern for all these things, and there are many more uses for encapsulation. It's not just an OO concern.
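As one illustration, here is a sketch of the lazy cache case in C++. `Polyline` is a hypothetical type; the cached value travels with every instance handed to users, yet only the type's own methods keep it consistent:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

class Polyline {
public:
    void add_point(double x, double y) {
        xs_.push_back(x);
        ys_.push_back(y);
        cache_valid_ = false;  // every mutation must invalidate the cache
    }

    // Convenient accessor: recomputes only when the points have changed.
    double length() const {
        if (!cache_valid_) {
            double total = 0.0;
            for (std::size_t i = 1; i < xs_.size(); ++i)
                total += std::hypot(xs_[i] - xs_[i - 1], ys_[i] - ys_[i - 1]);
            cached_length_ = total;
            cache_valid_ = true;
        }
        assert(cache_valid_);
        return cached_length_;
    }

private:
    std::vector<double> xs_, ys_;
    mutable double cached_length_ = 0.0;  // internal state stored in the object
    mutable bool cache_valid_ = false;    // writable only by this class
};
```

If `cached_length_` or the point vectors were writable from outside, any caller could leave the cache stale without the type noticing.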

I think the bigger issue with "public" and "private" is that it is insufficiently granular, being essentially all or nothing. The use of those APIs in various parts of the code base is not self-documenting. Hyrum's Law is undefeated.

C++ has the PassKey idiom that allows you to whitelist what objects are allowed to access each part of the public API at compile-time. This is a significant improvement but a pain to manage for complex whitelists because the language wasn't designed with this in mind. C++26 has added language features specifically to make this idiom scale more naturally.
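For anyone who hasn't seen it, a minimal sketch of the PassKey idiom as it is commonly written in pre-C++26 code. `Sensor`, `Engine` and `EngineKey` are hypothetical names:

```cpp
#include <cassert>

class Engine;  // the one class allowed to calibrate a Sensor

class EngineKey {
    friend class Engine;
    EngineKey() {}  // private and user-provided, so only Engine can create a key
};

class Sensor {
public:
    int reading() const { return raw_ + offset_; }  // open to everyone

    // Public, but uncallable without an EngineKey.
    void calibrate(EngineKey, int offset) { offset_ = offset; }

private:
    int raw_ = 10;
    int offset_ = 0;
};

class Engine {
public:
    void tune(Sensor& s) {
        s.calibrate(EngineKey{}, 5);  // allowed: Engine is a friend of EngineKey
        assert(s.reading() == 15);
    }
};
```

Any other caller that tries `s.calibrate(EngineKey{}, 5)` fails to compile because the key's constructor is private. The constructor is written as `EngineKey() {}` rather than `= default`, since before C++20 a defaulted one would leave the class an aggregate that anyone could create with `{}`.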

I'd love to see more explicit ACLs on APIs as a general programming language feature.

> I'd love to see more explicit ACLs on APIs as a general programming language feature.

In that I agree, but per-member public/private/protected is a dead end.

I'd like a high level language which explores organizing all application data in a single, globally accessible nested struct and filesystem-like access rights into 'paths' of this global struct (read-only, read-write or opaque) for specific parts of the code.

Probably a bit too radical to ever become mainstream (because there's still this "global state == bad" meme - it doesn't have to be evil with proper access control - and it would radically simplify a lot of programs because you don't need to control access by passing 'secret pointers' around).

Concur. Or, the in-between: Set the structs to be private if you need. I make heavy use of private structs and modules, but rarely private fields.
The solution to this is to simply put an underscore before the variables you don't think others should rely on, then move on with your life.
I think I mostly agree, but I do have one war story of using a C++ library (Apache Avro) that parsed data and exposed a "get next std::string" method. When parsing a file, all the data was set to the last string in the file. I could see each string being returned correctly in a debugger, but once the next call to that method was made, all previous local variables were now set to the new string. Never looked too far into it but it seemed pretty clear that there was a bug in that library that was messing with the internals of std::string, (which if I understand is just a pointer to data). It was likely re-using the same data buffer to store the data for different std::string objects which shouldn't be possible (under the std::string "API contract"). It was a pain to debug because of how "private" std::string's internals are.

In other words, we can at best form API contracts in C++ that work 99% of the time.

FWIW, the std::string buffer is directly accessible for (re-)writing via the public API. You don't need to use any private access to do this.
I believe private fields are a feature that actually increases the expressivity of a language, as per the formal definition. This one can't be replaced by some trivial, local syntactic sugar.

Of course increasing expressivity is not the end goal in itself for a PL, but I do agree with you that this design choice (and some others, like no unused variables - that one drives me up a wall) makes me less excited about the language than I would otherwise be.

You're getting a lot of responses with very strong opinions from people who talk as if they've never had to care about customers relying on their APIs.
It’s a trust thing.

If you can trust that downstream users of your API won’t misuse private-by-convention fields (or won’t punish you for doing so), it’s not a problem. That works a lot of the time: You can trust yourself. You can usually trust your team. In the open source world, you can just break compatibility with no repercussions.

But yes, sometimes that trust isn’t there. Sometimes you have customers who will misuse your code and blame you for it. But that isn’t the case for all code. Or even most code.

Andrew has so many wrong takes. Unused variables is another.

Such a smart guy though, so I'm hesitant to say he's wrong. And maybe in the embedded space he's not, and if that's all Zig is for then fine. But internal code is a necessity of abstraction. I'm not saying it has to be C++ levels of abstraction. But there is a line between interface and implementation that ought to be kept. C headers are nearly perfect for this, letting you hide and rename and recast stuff differently than your .c file has, allowing you to change how stuff works internally.

Imagine if the Lua team wasn't free to make it significantly faster in recent 5.4 releases because they were tied to every internal field. We all benefited from their freedom to change how stuff works inside. Sorry Andrew but you're wrong here. Or at least you were 4 years ago. Hopefully you've changed your mind since.

> I'm not saying it has to be C++ levels of abstraction. But there is a line between interface and implementation that ought to be kept. C headers are nearly perfect for this, letting you hide and rename and recast stuff differently than your .c file has, allowing you to change how stuff works internally.

Can’t you do this in Zig with modules? I thought that’s what the ‘pub’ keyword was for.

You can’t have private fields in a struct that’s publicly available but the same is sort of true in C too. OO style encapsulation isn’t the only way to skin a cat, or to send the cat a message to skin itself as the case may be.

I don't know Zig so I dunno maybe
I agree with almost all of this, including the point about C header files, except that code has to be in headers to be inlined (unless you use LTO), which in practice forces code into headers even if you’d prefer to keep it private.
There's nothing wrong with using LTO, but I prefer simply compiling everything as a single translation unit ("unity builds"), which gets you all of the LTO benefits for free (in the sense that you still get fast compile times too).
> But internal code is a necessity of abstraction

I just fundamentally disagree with this. Not having "proper" private methods/members has not once become a problem for me, but overuse of them absolutely has.

From my understanding, making a stable API is impossible in Zig anyway, since Zig itself is still making breaking changes at the language level.
How is this any different than Python or Ruby? You can access internals easily and people don't have a problem writing maintainable modular software in those languages.

Not to mention just about every language offers runtime reflection that lets you do bad stuff.

IMO, the Python adage of "We are all consenting adults here" applies.

I don't care about public/private.
You are right. Don't listen to the idiots!