by xja 3443 days ago
The author is essentially saying any API change has the potential to break backward compatibility and that we should define what kind of breakage is ok.

That's kind of interesting, and I'd not considered many of the scenarios mentioned (which apply beyond go).

Perhaps the "correct", but cumbersome thing to do is supply different versions of the API when maintaining backward compatibility and never change the old version.

5 comments

Most of these are treated as backwards incompatibilities by the stdlib folks and most users.

The one that's not is that adding a field or method could break importing code. The scenario here is that you embed two types, one from the upgrading lib and one not, and the upgrading lib adds a name that collides with one in the other type. Rather than pick a winner in that conflict (last embed in the type definition wins, say), Go raises an error so the human can explicitly resolve it by, e.g. renaming one of the conflicting names if you control the code (gorename helps!) or changing one of the embeds to a regular member.
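A minimal sketch of that embedding collision, with made-up types: `Logger` stands in for a type from the upgrading library, `Conn` for a type from another library, and `Service` for the importing code. None of these names come from the article. The ambiguous call is left commented out so the file still builds.

```go
package main

import "fmt"

// Logger is a hypothetical type from the library being upgraded.
type Logger struct{}

func (Logger) Name() string { return "logger" }

// Close is the method we pretend the new release of that library added.
func (Logger) Close() string { return "logger closed" }

// Conn is a hypothetical type from a different library that always had Close.
type Conn struct{}

func (Conn) Close() string { return "conn closed" }

// Service is the importing code: it embeds both types.
type Service struct {
	Logger
	Conn
}

func main() {
	s := Service{}

	// Before Logger gained Close, s.Close() resolved to Conn.Close.
	// With both present, the next line fails to compile with an
	// "ambiguous selector s.Close" error, so it stays commented out:
	// fmt.Println(s.Close())

	// The human picks a winner explicitly by naming the embedded type.
	fmt.Println(s.Conn.Close())
	fmt.Println(s.Logger.Close())
}
```

Note that merely declaring `Service` still compiles; the error only appears where the ambiguous name is actually used.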

This can happen, but I haven't seen it occur in the time I've been around the Go community. Never adding methods doesn't make sense, and silently resolving those conflicts could _still_ cause a backcompat issue and seems worse than the status quo. If you do hit the conflict, it's often resolvable (gorename!). So I think calling it backwards compatible to add a method/field is reasonable (as well as what everyone already does).

The practical backwards compatibility things that I have seen come up tend to have less to do with folks hitting cases like that than changes that are theoretically right but expose code that was buggy but happened to work before, e.g. programs that used to rely on racy map accesses or wrong cgo pointer usage or invalid Less functions in sort working, or trickiness around net protocol interfaces and buggy clients, or parallel tests exposing something.

> The practical backwards compatibility things that I have seen come up

Yes, I'm not really talking about practical problems :) This article is mainly a response to various people claiming, "backwards compatibility is easy, it's a breakage if you are breaking the build".

FWIW, most of the scenarios you mention I wouldn't really call a backwards compatibility issue either, though. I'd say they are bugs that are exposed by a change in API.

Yeah. I'm trying to say, sure, that's a clever note on what can, in very precise circumstances, happen whenever any field/method is added. But as working engineers, focusing on the practical problems is, well, the practical thing to do.

If what you're thinking is "well in that case, people should agree that adding methods/fields is OK," I think they do, and, e.g. https://golang.org/doc/go1compat spells out that methods may be added and how that can interact with multiple embedding.

Practically, I don't want my upstreams to stop adding methods or fields to structs or to create an extra step they have to go through before adding one. That slows down upstream dev; it's not a net win for me as a downstream or a cost I want to impose on (sometimes-unpaid) OSS maintainers.

I also don't want a tool to flag every upstream field/method addition as a breaking change, because that's noise to me in the common case where the change doesn't break _my_ program. Really, if I want a back-compat test, I should upgrade and try to run my code's tests since that can shake out many more kinds of break, including ones due to my bugs/my dependencies on undoc'd stuff. I should do it whether the API was tweaked or not, because any change can change behavior.

Again, you did make a clever observation of a potential build break; I'm just saying I don't think that Go library authors need to spend more time worrying about potential name collisions with multiple embedding, or that people should specifically build workflows and tools around it.

> But as working engineers, focusing on the practical problems is, well, the practical thing to do.

Well, yeah, I'm an engineer too (just educated as a mathematician) :)

FWIW, I tried to solve an engineering problem, namely "how can we get the advantages of SemVer, while working around human deficiencies in setting and maintaining them". Or "why should a human need to figure out a version number, if a computer can do it for me". But the thing is that this turned out not to actually be an engineering problem, but a political problem; if there is no obviously correct interpretation of "breaking change", then for a tool to be acceptable, you'd need to get people to accept its limited implementation of "breaking". I simply wasn't willing to put up with this political challenge (others were) :)

> If what you're thinking is "well in that case, people should agree that adding methods/fields is OK," I think they do, and, e.g. https://golang.org/doc/go1compat spells out that methods may be added and how that can interact with multiple embedding.

True. AFAIR that section was added around the time I wrote that article (I believe it was in tip a couple of days beforehand).

> Practically, I don't want my upstreams to stop adding methods or fields to structs or to create an extra step they have to go through before adding one. That slows down upstream dev; it's not a net win for me as a downstream or a cost I want to impose on (sometimes-unpaid) OSS maintainers.

I agree.

> I also don't want a tool to flag every upstream field/method addition as a breaking change, because that's noise to me in the common case where the change doesn't break _my_ program. Really, if I want a back-compat test, I should upgrade and try to run my code's tests since that can shake out many more kinds of break, including ones due to my bugs/my dependencies on undoc'd stuff

But you, as the author of a package, are not really the target audience either. You are able to fix compilation bugs, but your users likely are not.

But yeah, you probably also just aren't the addressee of my article. It is mostly addressed to people claiming that versioning in go is a solved problem and SemVer is the solution. It's not.

Your observation is pretty much the point of the article; a breakage happens, iff my code doesn't work anymore with an upgraded dependency, not more, nor less.

> Again, you did make a clever observation of a potential build break; I'm just saying I don't think that Go library authors need to spend more time worrying about potential name collisions with multiple embedding, or that people should specifically build workflows and tools around it.

Here, I disagree. Tooling to work around breakages would be excellent. As you mentioned yourself, for most breakages you just have to make the compiler happy and for most breakages it's pretty trivial to figure out what's needed. And at that point, there really should be a tool to do the job; after all, you should never send a human to do a computer's job. :)

> I'm an engineer too

Yep! Point is, we all are, so we all have to worry about how things break in practice, not only about the theoretical model of compatibility.

And it turns out we break each others' code lots of ways, sometimes even just changing behavior without touching the API. If I had a tool that detected certain build-time problems and worked around others, I could still end up upgrading to a new version of a library that breaks my product. So we end up with vendoring and such where product maintainers sort it out, and probably will continue to need some humans in the loop as long as releases have bugs.

I do think there's plenty to do on the larger issue of compatibility even if not specifically focused on this particular build-time break. There are tools I'd love to see, e.g. to test my program with its deps updated and maybe even do something git-bisect-ish to find just where things went wrong. (Node has something along those lines named Greenkeeper.) Peter Bourgon's got a group working on Go package management, and there's been work done, both practical (e.g. the tools and practices at https://peter.bourgon.org/go-best-practices-2016/#dependency...) and theoretical (https://research.swtch.com/version-sat). Useful progress and interesting stuff.

> Perhaps the "correct", but cumbersome thing to do is supply different versions of the API when maintaining backward compatibility and never change the old version.

This is the approach taken in good software deployments, especially if you cannot control who the users are.
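A rough sketch of what "never change the old version" could look like inside a single package. `SplitV1` and `SplitV2` are hypothetical names invented for illustration; real projects often put versions in separate packages or import paths instead.

```go
package main

import (
	"fmt"
	"strings"
)

// SplitV1 is the hypothetical original API. Once published, its behavior
// is frozen: it splits on commas and keeps surrounding whitespace.
func SplitV1(s string) []string {
	return strings.Split(s, ",")
}

// SplitV2 is the hypothetical newer API. It trims whitespace around each
// part. Existing callers of SplitV1 are unaffected because V1 never changes.
func SplitV2(s string) []string {
	parts := strings.Split(s, ",")
	for i, p := range parts {
		parts[i] = strings.TrimSpace(p)
	}
	return parts
}

func main() {
	fmt.Printf("%q\n", SplitV1("a, b"))
	fmt.Printf("%q\n", SplitV2("a, b"))
}
```

The cost is the cumbersome part mentioned above: every behavior change grows the API surface, and both versions have to be maintained.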

Then what's the purpose of the promise in the first place? That's just having the old APIs and compilers with the new optimisations (at least what I understood from your suggestion).

The main point of the article to me is "adding fields to Go structs is a breaking change" since your users could be using implicit struct initializer syntax, which will cause a compilation error. They have listed other points too, but I think this one can be highlighted.

Using the unkeyed initializer syntax is discouraged anyhow: https://golang.org/cmd/vet/#hdr-Unkeyed_composite_literals

Unkeyed syntax also doesn't work for a struct in another package with at least one private field.

One common exception is a vector type that isn't expected to change--RGBA, XYZ, LatLon, etc.
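For readers who haven't hit this, a small sketch of the keyed vs. unkeyed difference. `Point` is a hypothetical library type; pretend version 1 had only `X` and `Y` and version 2 added `Z`.

```go
package main

import "fmt"

// Point is a hypothetical library type. Imagine v1 declared only X and Y,
// and v2 (shown here) added Z.
type Point struct {
	X, Y int
	Z    int // field added in the hypothetical v2
}

func main() {
	// Unkeyed literal: this compiled against v1, but against v2 the
	// compiler rejects it with a "too few values" error, so it is
	// commented out here:
	// p := Point{1, 2}

	// Keyed literal: keeps compiling after the field is added; the new
	// field simply gets its zero value.
	p := Point{X: 1, Y: 2}
	fmt.Println(p.X, p.Y, p.Z)
}
```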

The issue is that even if you only consider keyed initializers, adding fields is a breaking change, strictly speaking. I tried making clear that I use the compatibility guarantee as a basis in the introduction and I mention this exception more thoroughly in the section about adding fields.

There's no need to reinvent the wheel. Semantic versioning already defines a standard for managing version changes in a pragmatic way.

http://semver.org/

I think you either did not read the article or missed the point. Author says if you make _seemingly innocent non-breaking changes_ you might think it does not require a major version change in SemVer but in fact (for instance in case of adding struct fields) it does.

Yes but only in Go. There are projects that have successfully evolved APIs in not only source compatible but binary compatible ways over a period of decades. Win32 and Java are the obvious candidates.

Not changing or deleting existing functions and fields is easy. The problem is that Go apparently doesn't let you reliably add things either. That is kind of a joke, how is anyone expected to be able to evolve an API if you can't even add things?

Seems like the consensus is "well it doesn't happen often so it's not an issue" which is weak. Go isn't even trying for binary compatibility, the article is only talking about source compat!

> Yes but only in Go.

I don't believe this to be true. I'd say at least most languages/projects would have exactly the same kind of problems (some of them at runtime, some of them at compile time), it's just that in practice they don't really matter. So you don't know about it and it seems, to a casual observer, to work just fine.

> The problem is that Go apparently doesn't let you reliably add things either.

Neither does, at least, C. I'd guess that python also doesn't let you do it. I don't know enough about java, but I'd assume that it has the same problems.

What you need to understand is, that I applied an extremely nit-picky interpretation of stability. It's not like any of the things I mentioned are learned from experience or practical problems; I just understand the go type system enough and thought really hard whether there is an obviously correct and useful interpretation of API stability under its constraints and came up with a (mostly theoretical) result. Most projects in go that use SemVer apply a more generous and naive notion that works well enough in practice. C projects do exactly the same.

doesn't that mean you thought wrong, and it is in fact a breaking, major-version change? that seems fine - it's a signal to your consumers that it may not work out-of-the-box.

What I was trying to illustrate is that the notion of versioning is broken in and of itself. By the lessons of the article, pretty much every API change is a breaking change, so you would constantly need to increment the major version, if you take SemVer seriously, meaning minor versions don't exist, de facto. And if you now imagine that you'd need to touch your code every time one of your dependencies increments their minor version (I mean. Likely you can't, because you have no notion of how often that happens, because currently all your tools just ignore those changes), lest people can't build your software because the packaging tool needs to assume an incompatibility that needs manual resolving, you will see how useless this makes versioning.

Now, there are two ways out of this mess: One is to ignore it and just assume that, in practice, some things are more likely to break consumers than others and just apply a reasonable case-by-case judgement. It's what's happening right now in probably ~every language for ~every tool out there. I think it's reasonable, but I personally dislike it, because for one, humans eff up all the time, so relying on them having a good notion of what breaks and applying it consistently and in a timely manner leads to pain. This whole article is born out of the idea that these things should be codified and then automatically applied; I shouldn't even need to know what version my package currently is, IMHO.

The other way out is much more complicated: Transition to a notion of breakage not by versioning APIs, but by defining it in terms of pairs of packages. This also has a bunch of definite and obvious deficiencies (for example that you don't have access to all the code that imports you. Or the combinatoric explosion).

Currently, my personal hope is that this can be solved by supporting gradual code repair (see, e.g. https://github.com/golang/go/issues/18130 for what this means and how this is currently progressing) and then add good tooling (I'm working on something, but I have limited time and brain space). We'll see :)
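A hedged sketch of what gradual repair can look like, using type aliases (available since Go 1.9), which appear to be the mechanism that issue led to; treat that link as an assumption. `NewConfig`, `OldConfig`, and `useOld` are hypothetical names, and in a real move the two names would live in different packages.

```go
package main

import "fmt"

// NewConfig is the hypothetical new name (or new home) of the type.
type NewConfig struct {
	Addr string
}

// OldConfig is kept as an alias, not a new type: both names denote the
// identical type, so callers can migrate one at a time without conversions.
type OldConfig = NewConfig

// useOld is not-yet-migrated code that still spells the old name.
func useOld(c OldConfig) string {
	return c.Addr
}

func main() {
	// Already-migrated code constructs the value under the new name and can
	// pass it straight to code that still uses the old one.
	c := NewConfig{Addr: "localhost:8080"}
	fmt.Println(useOld(c))
}
```

Once every caller uses the new name, the alias can be deleted in a later release.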

A good chunk of that is that Go, by design, makes nearly all changes breaking (as you covered pretty well). I doubt they did so intentionally, but they seem to focus their decisions on small-ish scale projects at the expense of large / longer-term ones, and this is the natural consequence.

I think the focus here makes sense, and improves lots of useful things in practice (which is why they do it - Go focuses on pragmatism), I just don't really like it. Every non-side-project I've worked on has chafed under weaknesses that Go seems to embrace, because the code has had to survive and grow for a couple years. It's a sizable step up from Python tho.

As far as ways out of this mess... not sure. A lot of the problems are solved by "hit it until it compiles", which is a good thing, and often implies automated code-rewriting tools are possible. The rest (adding methods -> you may collide with an interface which you didn't before) can probably be detected so you at least have your potential-problems enumerated. There are some fairly sophisticated tools out there for doing both of these, e.g. https://github.com/facebook/codemod , and it'd be great to see more language-communities embrace (and improve) them IMO.
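To make the "adding methods -> you may collide with an interface" case concrete, here is a small sketch. `Temp` is a hypothetical upstream type and `describe` is hypothetical downstream code; the point is that nothing fails to compile, yet behavior changes once the method exists.

```go
package main

import "fmt"

// Temp is a hypothetical upstream type.
type Temp float64

// String is the method we pretend upstream added in a new release.
// With it, Temp starts satisfying fmt.Stringer.
func (t Temp) String() string {
	return fmt.Sprintf("%.1fC", float64(t))
}

// describe is hypothetical downstream code that branches on whether a
// value implements fmt.Stringer.
func describe(v interface{}) string {
	if s, ok := v.(fmt.Stringer); ok {
		return "stringer: " + s.String()
	}
	return "plain value"
}

func main() {
	// Before the String method was added, Temp took the "plain value"
	// branch; after the upgrade it silently takes the other one.
	fmt.Println(describe(Temp(21.5)))
	fmt.Println(describe(21.5))
}
```

A tool can find type assertions like the one in `describe` only in code it can see, which limits how much of this is detectable from the library's side.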

If you manage to limit most of your changes to "can be automatically changed / detected", you have a fair bit more freedom. With Go's limitations... maybe enough? I'd have to read and think a lot harder to figure out if there would be too many things that fall into those gaps.

> they seem to focus their decisions on small-ish scale projects at the expense of large / longer-term ones, and this is the natural consequence.

This is - excuse me - a pretty ridiculous claim, given the explicit design goals of go. https://talks.golang.org/2012/splash.article You might disagree, that their choices are furthering these design goals (I don't), but claiming that they focus on small-ish projects is just non-factual, most design decisions are driven by a focus on large (both in problem- and code size) projects.

> A lot of the problems are solved by "hit it until it compiles"

Neither do I want to need to fix compilation errors in software that I use, nor do I want my users to have to do it. This is not a solution to the problems at hand.

> The rest (adding methods -> you may collide with an interface which you didn't before) can probably be detected

No, they can't. From looking at your code you only get an outwards pointing import-arrow. There is no way for you to know which code actually relies on your API and in what way. You might, say, limit your guarantees to packages on godoc.org (which is indeed what I do in practice), but it's still hardly a solution to the problem.

> If you manage to limit most of your changes to "can be automatically changed / detected", you have a fair bit more freedom.

This is indeed the path I'm currently following; for every breaking change, provide an automatic fix. It still isn't an actual solution, though, just a halfway decent workaround. It especially doesn't work without https://github.com/golang/go/issues/18130 and a couple of other needed changes.

Luckily, go has immensely powerful tools to perform these kinds of rewrites (much more powerful than any other language I'm aware of) and I'm thus pretty optimistic that this will provide a good way forward soon (now that the need for gradual repair is officially recognized).

If you follow semver, this article tells you when you need to bump the major version versus the minor. The gist of the article is that you almost always need to bump the major version.