by cle 1596 days ago
Personally, I’ve run into more problems with strict enum types in distributed systems in a team setting than I have with Go’s lack of them. In that setting, strict enums are usually over-strict and eventually you back yourself into a corner in terms of being able to roll out new enum values.

When there’s no clear winner in terms of tradeoffs, I prefer to leave it out of the language like Go has done.


> strict enums are usually over-strict and eventually you back yourself into a corner in terms of being able to roll out new enum values.

Yes, they are poison for the evolution of a public API.

Swift solved this problem with [non-exhaustive enums](https://github.com/apple/swift-evolution/blob/main/proposals...).
Is it solved? How do you decide which enums are final and which ones need to be non-exhaustive to allow evolving the code?

I think that Go's decision stems from protocol buffers, which allow new values to be pushed through old binaries; that becomes a must once you grow enough.

https://developers.google.com/protocol-buffers/docs/proto3#e...
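
For anyone who hasn't seen what "pushing new values through old binaries" looks like, here is a rough Go sketch. It is only an illustration: `Color`, `Pixel` and `Relay` are made-up names, and JSON stands in for the wire format (this is not protobuf's actual encoding or generated code). The point is that the enum is just an integer type, so a value the binary has no name for survives a decode and re-encode.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Color is a hypothetical open enum: any int32 is a legal value, and the
// named constants are only the ones this particular binary knows about.
type Color int32

const (
	ColorUnspecified Color = 0
	ColorRed         Color = 1
	ColorGreen       Color = 2
)

// String names the known values and falls back to the raw number otherwise.
func (c Color) String() string {
	switch c {
	case ColorUnspecified:
		return "UNSPECIFIED"
	case ColorRed:
		return "RED"
	case ColorGreen:
		return "GREEN"
	default:
		return fmt.Sprintf("Color(%d)", int32(c))
	}
}

// Pixel is a hypothetical message that carries the enum.
type Pixel struct {
	ID    int   `json:"id"`
	Color Color `json:"color"`
}

// Relay stands in for an old binary that decodes a message and forwards it.
func Relay(in []byte) ([]byte, error) {
	var p Pixel
	if err := json.Unmarshal(in, &p); err != nil {
		return nil, err
	}
	return json.Marshal(p)
}

func main() {
	// A newer producer sends color 3, which this binary has no name for.
	out, err := Relay([]byte(`{"id":7,"color":3}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"id":7,"color":3}
}
```

The old binary can't say anything meaningful about color 3, but it doesn't reject or drop it either, so a newer consumer downstream still sees the value.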

Well, one thing is that enums are non-frozen by default, so you have to actively tag it as `frozen` if you want to put yourself in a scenario where you're never allowed to add cases.

When clients use `switch` on a non-frozen enum from outside its defining module, Swift emits a warning if they don't have an `@unknown default:` case... so consumers of your enum will have to have default logic for handling new cases in order to avoid this warning. (Not for frozen enums though, for frozen ones it's enough to just cover the known cases in calling code, since the expectation will be that you can't update them.)

So basically, if you don't bother thinking much about the problem, you can just avoid adding `frozen` and you'll probably get reasonable behavior where you can add more cases later. Using `frozen` should only be the case if there is some sort of logical impossibility for there to be more cases. Something like how `Optional` has .some and .none, but it's pretty obvious that nobody's going to go add a new case to it (what would a new case even mean?) Same with Result, and probably a bunch of other types I can't think of at the moment.

Also worth noting that Swift treats intra-library code very differently than code that links from another library... if you use your own enums in your own module and don't make them public, it treats them as if they're always frozen... which is nice because it's your internal code and you can always update your own usages without having to worry about compatibility.

> Well, one thing is that enums are non-frozen by default, so you have to actively tag it as `frozen` if you want to put yourself in a scenario where you're never allowed to add cases.

Yeah, non-frozen by default makes a lot of sense. The only gotcha left is that you can't take `frozen` back once you've added it, but that's ultimately the behavior you want, and it has to be able to bite you.

> if you use your own enums in your own module and don't make them public, it treats them as if they're always frozen... which is nice because it's your internal code and you can always update your own usages without having to worry about compatibility.

That's neat

> which is nice because it's your internal code and you can always update your own usages without having to worry about compatibility.

Unless your services talk to each other or share some external data storage? Which is actually really common?

Services talking to each other, and storage persistence, are only tangentially related to internally defined enums. At some point you need to marshal data in and out of a serialization boundary, and it is at that point that you must handle cases you didn’t anticipate. But it’s just serialized data; it may be intended to represent the same value your enum describes, but it’s up to the deserialization code to do the right thing if it encounters a value it doesn’t recognize.

What I mean is, code that deals with serialization cannot by definition avoid the problem of “what if the data is invalid”. It’s not just enums but every aspect of your type system that must deal with this problem (typically by just throwing an error if the data is invalid, etc.)
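
A minimal sketch of that boundary in Go, under the assumption that the wire format carries the value as a string. `Status`, `ParseStatus` and `ErrUnknownStatus` are hypothetical names made up for the example. The parser is the one place that decides what to do with a value it doesn't recognize (here, return an error), and code past it only ever sees the values it knows.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a hypothetical internal enum with three known values.
type Status int

const (
	StatusPending Status = iota
	StatusActive
	StatusClosed
)

// ErrUnknownStatus is returned when the wire value is not recognized.
var ErrUnknownStatus = errors.New("unknown status")

// ParseStatus converts a wire string into a Status and rejects anything
// it does not recognize, so invalid data never gets past the boundary.
func ParseStatus(s string) (Status, error) {
	switch s {
	case "pending":
		return StatusPending, nil
	case "active":
		return StatusActive, nil
	case "closed":
		return StatusClosed, nil
	default:
		return 0, fmt.Errorf("%w: %q", ErrUnknownStatus, s)
	}
}

func main() {
	for _, raw := range []string{"active", "archived"} {
		st, err := ParseStatus(raw)
		if err != nil {
			fmt.Println("rejected at the boundary:", err)
			continue
		}
		fmt.Println("accepted:", st)
	}
}
```

Whether rejecting is the right call depends on the system; the sketch only shows that the decision lives in the deserialization code, not in the enum itself.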

That's a fair concern: if everything is strict, then there's no option to incrementally roll out a new value. Maybe a proper enum type could always have an `Unknown` value, which would allow for the leniency while still forcing the user to think about (and handle) it at compile time?
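
Roughly what the `Unknown` idea could look like if done by hand in Go today. This is a sketch with made-up names (`Plan`, `Account`, `DecodeAccount`, `SeatLimit`), and Go won't enforce the handling at compile time, so the "forcing" part is convention only here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Plan is a hypothetical enum whose zero value is an explicit Unknown.
type Plan int

const (
	PlanUnknown Plan = iota
	PlanFree
	PlanPro
)

// UnmarshalText maps unrecognized wire strings to PlanUnknown instead of
// failing, so a value added by a newer producer does not break decoding.
func (p *Plan) UnmarshalText(text []byte) error {
	switch string(text) {
	case "free":
		*p = PlanFree
	case "pro":
		*p = PlanPro
	default:
		*p = PlanUnknown
	}
	return nil
}

// Account is a hypothetical payload that carries the enum.
type Account struct {
	Name string `json:"name"`
	Plan Plan   `json:"plan"`
}

// DecodeAccount parses a JSON payload into an Account.
func DecodeAccount(raw []byte) (Account, error) {
	var a Account
	err := json.Unmarshal(raw, &a)
	return a, err
}

// SeatLimit has to decide what Unknown means; this sketch picks the most
// conservative answer.
func SeatLimit(p Plan) int {
	switch p {
	case PlanFree:
		return 1
	case PlanPro:
		return 10
	default: // PlanUnknown, e.g. a plan added after this binary shipped
		return 1
	}
}

func main() {
	a, err := DecodeAccount([]byte(`{"name":"acme","plan":"enterprise"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Plan == PlanUnknown, SeatLimit(a.Plan)) // true 1
}
```

One tradeoff worth noting: collapsing everything unrecognized into `Unknown` loses the original value, so this binary can't forward it unchanged.
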
Rust supports the `#[non_exhaustive]` attribute, which forces users to cover the generic/wildcard case even though they have already covered all the existing ones. That said, I would rather do versioning and a breaking change if possible. Put it at the parsing/interop level rather than deep in the code at runtime, because it is very likely that your code is not correct without handling the extra case either way.

https://doc.rust-lang.org/reference/attributes/type_system.h...

Oooor the language can have proper type-safe enumerated types of some sort, and if you're in a domain where that's an issue you don't use them.
Go is focused on the distributed systems domain though. It's fine, even desirable, to have languages focused on particular domains, that make design decisions based on the constraints of the domain. In this domain, closed enums are footguns with costly consequences if you get it wrong.
As someone that doesn't work in that domain, could you give a short example?
You push Thrift clients out into the world expecting a certain API field to be typed according to a 3-element enum. You add a 4th element to support a new feature in a new client. If you ever accidentally serve this 4th element to an old client, it will crash on deserialization. Bonus points if the client is old enough that it's not part of your testing regime anymore.
I would be okay with an open-by-default enum type. (I think...I'm not sure I've ever encountered a language with open-by-default enums.)

I'm still not sure if it's worth it; it's idiomatic Go to "fall through" if-branches to a default case, and checking quasi-enums works the same way. The symmetry is nice and makes it very easy to read. But I could be convinced.
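
For readers who don't write Go, a small sketch of the quasi-enum pattern being described. `Shape` and `Sides` are hypothetical names. The known values are handled in a `switch`, and anything else falls through to the code after it, the same way an `if err != nil` check returns early and the normal path continues below.

```go
package main

import "fmt"

// Shape is a hypothetical quasi-enum: a named integer type plus constants.
type Shape int

const (
	Circle Shape = iota
	Square
	Triangle
)

// Sides handles the values it knows about and falls through to a default
// for everything else.
func Sides(s Shape) (int, error) {
	switch s {
	case Circle:
		return 0, nil
	case Square:
		return 4, nil
	case Triangle:
		return 3, nil
	}
	// Reached for any value without a case, including ones added later.
	return 0, fmt.Errorf("unhandled shape %d", int(s))
}

func main() {
	for _, s := range []Shape{Square, Shape(7)} {
		n, err := Sides(s)
		fmt.Println(n, err)
	}
}
```

Nothing stops a caller from passing `Shape(7)`, which is exactly why the default path has to exist.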

What problems did an enum cause you, and how was the enum responsible for the problem?
This is a known issue with Rust, for example. I have an enum with variants A and B. Somebody writes an exhaustive switch (match statement) that handles A and B with no default case. I add a variant C. Their code breaks because they don’t handle C. Adding an enum variant was a breaking change.

In Rust, the answer is #[non_exhaustive], which forces consumers to always add a default case. It’s not a huge deal, just a known issue with a well-understood solution.

I guess I don't quite get this. What I'm seeing here is that a non-obvious breaking change is being turned into an obvious one.

If something is handling A and B, but you add C, the code probably needs to make sure it's handling C correctly.

I use Java in my day job and the behavior you've outlined is how I always code things, but it's manual and doesn't happen at compile time: I provide a default that throws a runtime exception.

If your package updates and users of your package update and their code ceases to compile, that seems... fine? It's the system working as intended. They can just downgrade back to the previous known good version. It would be much worse if you made a breaking change and consumers' code that used yours continued to compile but no longer functioned as expected.
IME the problem is the default behavior. Rust, Java, et al. default to closed enums, and you have to opt in to open enums, whether that's adding a type attribute in Rust or implementing special code in Java to handle the case. This is a footgun for distributed systems. If you don't get it wrong, then clients writing their own client-side software will get it wrong.

I'm not disparaging closed enums, they are very useful in certain contexts, but they make it really easy to do the wrong thing when reading data off the wire. Given Go is focused on this exact domain (distributed systems), I am glad the language doesn't have them.