Hacker News
by lamontcg 1082 days ago
Somehow we need a less horrible SemVer or a less horrible social contract around SemVer.
4 comments

Or just use ISO8601-formatted dates as versions and derive huge benefits.

1. version numbers sort numerically and lexicographically in a sensible way, including across projects and packages which use the same format

2. users get educated that these preciously-held ideas they have about software version numbers are complete superstition. Like "something with a zero major number means not production ready", "something with a zero minor number means I should wait until there's a patch", "something with a major number increase means backwards-incompatible", "something with a minor number increase means backwards-compatible"

3. You know when a particular version (of everything) came out. "We started seeing a weird bug on X date" is no longer impossible to figure out.
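
To make point 1 concrete, here is a small sketch in Python. The version strings are made-up examples, not taken from any real project. It shows that zero-padded date versions sort the same way as plain text and as dates, while dotted numeric versions need a numeric key to sort sensibly.

```python
# Zero-padded ISO 8601 dates: plain string sorting is already chronological.
date_versions = ["2023-07-08", "2022-12-31", "2023-01-15"]
sorted_dates = sorted(date_versions)

# Dotted numeric versions: plain string sorting puts 1.10.0 before 1.2.0.
dotted_versions = ["1.9.0", "1.10.0", "1.2.0"]
sorted_as_text = sorted(dotted_versions)

# Sorting them sensibly needs a key that compares each part as an integer.
sorted_numerically = sorted(
    dotted_versions, key=lambda v: tuple(int(part) for part in v.split("."))
)

print(sorted_dates)        # ['2022-12-31', '2023-01-15', '2023-07-08']
print(sorted_as_text)      # ['1.10.0', '1.2.0', '1.9.0']
print(sorted_numerically)  # ['1.2.0', '1.9.0', '1.10.0']
```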

I've done this for personal projects for years, for exactly the reasons you state - other than user re-education, since my projects don't have enough users to make an impact. For GitHub releases, I get the CI script to do ``git log -1 --format=%cd-%h --date=format:%Y%m%d-%H%M%S'' (producing output along the lines of "20230708-150500-1234567") and use that as the suffix. (Add a -prerelease suffix if it's coming from a non-default branch.) This sorts nicely and saves me 2 minutes if I need to find the commit in the history.

(These are self-contained projects. I suppose semver does make some sense for libraries that you link with.)

Professionally, it's been 99% Perforce for about 15 years, so it's routine to use the submitted changelist number, submitted changelists being numbered in the order they were subsequently committed. Sadly not fixed-width, but at least Explorer sorts them sensibly.

Two difficulties I have had doing this with git:

- there doesn't seem to be a way to get git to enforce UTC, so the dates are in my local time zone (for my projects this is not really an issue, and my timezone is almost UTC anyway)

- the CI system runs separate builds for different targets, and using the git commit timestamp ensures all builds get the same time stamp. But it's then possible to end up with timestamps significantly different from the actual release time, or (worse) out of order. I could probably do something better about this than my current "solution" of doing nothing, but this has only happened a couple of times
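
One possible workaround for the first difficulty, as a rough sketch rather than what the commenter actually does: have the CI script ask git for the committer date as a Unix timestamp (the `%ct` format placeholder) and the short hash (`%h`), then do the formatting in UTC outside git. The function name `version_from_commit` and its parameters are hypothetical, and the example hash is just the one from the comment above.

```python
from datetime import datetime, timezone


def version_from_commit(epoch_seconds: int, short_hash: str,
                        default_branch: bool = True) -> str:
    """Build a YYYYMMDD-HHMMSS-hash version string, always in UTC.

    epoch_seconds is assumed to come from `git log -1 --format=%ct`
    and short_hash from `git log -1 --format=%h`.
    """
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    version = f"{stamp.strftime('%Y%m%d-%H%M%S')}-{short_hash}"
    # Mark builds from non-default branches, as described in the comment.
    if not default_branch:
        version += "-prerelease"
    return version


print(version_from_commit(1688828700, "1234567"))
# 20230708-150500-1234567
print(version_from_commit(1688828700, "1234567", default_branch=False))
# 20230708-150500-1234567-prerelease
```

This does nothing for the second difficulty: the stamp is still the commit time, not the release time.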

That only works for software with simple linear versioning. If I have two major versions (say 2 and 3), I could still release a minor version to the older major version (so I would release 2.8, then 3.0, then 2.9, then 3.1).
When working with versioning software that requires a version in the format A.B.C, I like to use YYYY.MMDD.N, where N is the number of versions already released on that day.
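
A minimal sketch of that YYYY.MMDD.N scheme in Python. The function name and arguments are made up for illustration, and zero-padding the MMDD part is an assumption; some tools that expect strict A.B.C numbers may reject the leading zero, in which case the padding would have to go.

```python
from datetime import date


def calendar_version(release_day: date, released_today: int) -> str:
    """Return YYYY.MMDD.N, where N counts versions already released that day."""
    if released_today < 0:
        raise ValueError("released_today must be zero or more")
    # Month and day are zero-padded so the middle part is always four digits.
    return (f"{release_day.year}."
            f"{release_day.month:02d}{release_day.day:02d}."
            f"{released_today}")


print(calendar_version(date(2023, 7, 8), 0))    # 2023.0708.0
print(calendar_version(date(2023, 12, 25), 2))  # 2023.1225.2
```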
My rules for 4-number semver:

1) major: public API change. If you don't have a public API, this should never be anything but 1 and can be hidden from the user. End users don't want these; they scare them.

2) minor: any planned release that doesn't break API. End users love these and plan around them.

3) revision: unplanned emergency hotfixes. Naming this way means the "next minor" we were talking about with all stakeholders is still the next minor. It also means our version numbers look like our git dag, since this one would be a branch from the last tag instead of main.

4) release: sometimes something goes wrong during the release itself. The first 3 numbers are public, this is internal-only and only appears in git tags and internal deployment notes. This way every push to prod has a unique version number, but all our change management documents are still accurate even if we had to push 2 or 3 times for a single release.
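
The four rules above could be sketched roughly like this in Python. The `Version` class and its method names are hypothetical, and resetting the lower numbers to zero on a bump is an assumption the rules don't spell out.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    revision: int = 0
    release: int = 0

    def bump(self, part: str) -> "Version":
        """Increment one part and reset everything to its right (assumed)."""
        if part == "major":      # public API change
            return Version(self.major + 1, 0, 0, 0)
        if part == "minor":      # planned release, API unchanged
            return Version(self.major, self.minor + 1, 0, 0)
        if part == "revision":   # unplanned emergency hotfix
            return Version(self.major, self.minor, self.revision + 1, 0)
        if part == "release":    # re-push during a troubled deployment
            return self._replace(release=self.release + 1)
        raise ValueError(f"unknown part: {part}")

    def public(self) -> str:
        """The three numbers shown to users."""
        return f"{self.major}.{self.minor}.{self.revision}"

    def internal(self) -> str:
        """All four numbers, for git tags and deployment notes."""
        return f"{self.public()}.{self.release}"


hotfix = Version(1, 2).bump("revision")
print(hotfix.public())                    # 1.2.1
print(hotfix.bump("release").internal())  # 1.2.1.1
print(hotfix.bump("minor").public())      # 1.3.0
```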

Why? First digit is about compatibility. All the other digits are about planning.

"We're working towards 1.3"

"We think this feature will be in 1.4".

"We had to release 1.2.1 because of an emergency: somebody put Arabic text in their profile picture filename and that brought down the site."

"Turns out that trick with the release pipeline didn't work in prod so we had to make 1.2.1.1 while deploying".

That at least fixes the qualitative nonsense between minor and patch updates.

I think there needs to be better project definitions around what constitutes a major change.

Projects need to be able to define things like dropping support for old versions of the underlying language in minor versions. So that the last version of support that some people might get is "3.2" and "3.3" may not install at all for them. That means that technically they are in a state where they need to do work to upgrade and are "broken" in a sense, but the actual public API of the software has not changed between "3.2" and "3.3". Supported O/S distro versions should also be able to be abandoned in minor releases. Toolchain updates can also happen in minor releases. Pulling in major versions of dependencies which are technically breaking for anyone who hits a diamond dependency issue, but which produce no major breaking API changes should be able to happen in minor versions.

That means that the contract isn't "I can pull in minor versions and you can never force me to do work" but more strictly that the public API the software exposes won't update.

There's also the problem with semver pinning, where projects put hard floor and ceiling pins on all their dependencies, even though their software may be fine with a 5-year-old version of the dep (they've just never tested it) and may work fine with the next major release of the dep without any changes at all. Ideally for that last problem, the compatibility matrix fed into the dependency solver should really be a bit more malleable, so that the engineer can realize that the next version of a dependency breaks everything and then retcon the compatibility of their software to pin to the last working version of that dependency. This breaks the perfect immutability of literally everything about a software release, but allows for not being able to predict the future.
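
As a rough illustration of the "malleable compatibility matrix" idea (not how any real dependency solver works, as far as this sketch is concerned), here is a toy Python record that starts with no ceiling and has one added after the fact. The `Compatibility` class, its method names, and the version numbers are all hypothetical.

```python
from typing import Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '3.7.2' into (3, 7, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


class Compatibility:
    """What one published release claims about one dependency."""

    def __init__(self, floor: Optional[str] = None,
                 ceiling: Optional[str] = None) -> None:
        self.floor = floor
        self.ceiling = ceiling

    def allows(self, dep_version: str) -> bool:
        """Both bounds are inclusive; a missing bound means 'untested, allowed'."""
        candidate = parse(dep_version)
        if self.floor is not None and candidate < parse(self.floor):
            return False
        if self.ceiling is not None and candidate > parse(self.ceiling):
            return False
        return True

    def retcon_ceiling(self, last_working: str) -> None:
        """Record, after release, the last dependency version known to work."""
        self.ceiling = last_working


compat = Compatibility()            # shipped without pins
print(compat.allows("4.0.0"))       # True
compat.retcon_ceiling("3.7.2")      # 4.0.0 turned out to break everything
print(compat.allows("4.0.0"))       # False
print(compat.allows("3.7.2"))       # True
```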

What are the horrible things about SemVer? Can you give details?
Semantic Versioning requires you to declare a public API, which is not even remotely possible for many projects. If the public API surface is clear, semantic versioning does indeed work well, but otherwise it doesn't give much information, as users have no idea what the public API would be. Calendar versioning [1] or even a single-number version is preferable in such situations.

[1] https://calver.org/

Yes! Thank you. Exactly this :)

Nearly every org I’ve worked in has used semver internally and nearly every time their version numbers were just incremented arbitrarily because there wasn’t an exposed API.

This led to countless problems, not least because semver usually requires one to manually set the version number based on the change log, and people are generally pretty bad at changing point releases.

So I’ve usually ended up changing the versioning scheme to build number (generated by the CI/CD tooling) plus some extra information like git hash and/or timestamp - depending on the application and whether that build information can be easily encoded as additional metadata or not.

In my opinion, Semver only makes sense for shared libraries, not for applications, or OSes, or for APIs exposed on the network.

For applications, it goes just like you said.

For APIs on the network, the caller should only get to control the breaking version their request gets routed to (the rest is abstractly owned by the service provider).

> Semver only makes sense for shared libraries, [...]

While it does make more sense for them, that is not even as clear as it seems. Library authors rarely define the public API because it is very tedious and hard to complete---there are only some sort of fuzzy and implicit "common sense" definitions. Whenever "breaking" changes happen the definition gets stronger (but still incomplete), and over time it would encompass every observable aspect, as Hyrum's Law suggests.

In either case semantic versioning is not tremendously useful because either users have an incomplete expectation of what major, minor and patch versions mean, or they will be notified of every possible change and the version distinction becomes useless. Semantic versioning is still useful because it was a codification of existing practice where the expectation can be good enough to avoid most issues. There is no actual value added by the codification in my opinion.

Many languages have explicit access levels. Others have naming conventions. Library authors use them often.

I think more software follows semantic versioning than before it was codified.

People take it too seriously and don't realize that you can't realistically categorize every single change neatly into 3 separate breakage categories. Arguments abound about how to manage this properly with all sorts of schemes. The fact that "0" is a special case that deserves any consideration is an example of it being broken, imo. What does it actually matter what the first digit is?

Version numbers just denote a change happened and you want them to roughly resemble some sort of chronological ordering. Everything else is gasoline for flame wars and company policies.

To me semver only makes sense if critical bug (or security) fixes get backported to old major version(s). Otherwise downstream consumers do not really have true choices to make based on the info deduced from semver. Basically, if as an upstream your intent is not to support old versions, then that heavily implies that everyone should update to the latest asap regardless of the breakage.
Even if I always take the latest for direct dependencies, semver is still helpful preventing breakage from incompatible upgrades to indirect dependencies. If I depend on library A, and library A depends on library B, I can't fix any breakage from an incompatible update to library B. I need to wait for library A to update.