Nov 10, 2017
The cargo cult of versioning

All software comes with a version, some sequence of digits, periods and characters that seems to march ever upward. Rarely are the optimistically increasing versions accompanied by a commensurate increase in robustness. Instead, upgrading to new versions often causes regressions, and the stream of versions ends up spawning an extensive grapevine to disseminate information about the best version to use. Unsatisfying as this state of affairs is to everyone, I didn't think that the problem lay with these version numbers themselves. They're just names, right? However, over the past year I've finally had my attention focused on them, thanks to two people:

  • Rich Hickey pointed out last year that the convention of bumping the major version of a library to indicate incompatibility conveys no more actionable information than just changing the name of the library.

  • I recently encountered this post from 2012 by Steve Losh, pointing out that if version numbers were any good, we'd almost always be looking to use the latest version number. But that doesn't happen. In the real world we're constantly trying to hold the stream of versions back, to clamp version numbers to some ceiling. Pinning our systems to use a specific version or range of versions.
  • Why is version pinning so prevalent? The proximal reason is that modern package managers uniformly1 fail to provide the sane default of "give me the latest compatible version, excluding breaking changes."

  • RubyGems defaults to newest version available. So if you don't specify a version for a dependency and they create a breaking v3.0 when you've been implicitly using v2.2, boom. Result: best practice of pinning major version using the twiddle-waka or pessimistic operator all over the place, and Steve Losh shaking head sadly.

  • NPM for Node.js defaults to version tagged "latest". Again, if you go with the default version your project is liable to go boom at some future date. Best practice is to pin major versions using either the tilde operator or tags. Cue more sadness for Steve Losh. [Edit: The fact that you're required to provide some sort of version string at least raises the question of what it should be. It's still a gotcha that the empty string "" means "latest", but it's mitigated by the use of npm install --save to record dependencies rather than editing package.json directly.]

  • Leiningen for Clojure once again defaults to the latest version. [Correction: a comment pointed out that Clojure has no default, but always requires a version. This approach suffers from the opposite problem: you never get any bugfixes or security fixes unless you explicitly mess with package.clj.]

  • …and so on.
  • These are all deep, competent projects. Why are their defaults so uniformly useless and misleading? The underlying reason is the traditional format of version numbers: mashing together multiple numbers into a single string, and more importantly separating the version string from the name of a package. A dependency that is just a name provides no hint on what version you want compatibility with, so a package manager has no easy way to pick a good default version.

    Towards a better approach

    To begin with, it's weird that versions are strings. Parsing versions is non-trivial. Let's just make them a tuple. Instead of "3.0.2", we'll say "(3, 0, 2)".

    Next, move the major version to part of the name of a package. "Rails 5.1.4" becomes "Rails-5 (1, 4)". By following Rich Hickey's suggestion above, we also sidestep the question of what the default version should be. There's just no way to refer to a package without its major version.

    Since we always want to provide the latest version by default, the distinction between minor versions and patch levels is moot. Just combine the 2-tuple into a single number. "LeftPad 17.5.0" would now become something like "LeftPad-17 37".

    At this point you could even get rid of the version altogether and just use the commit hash on the rare occasions when we need a version identifier. We're all on the internet, and we're all constantly running npm install or equivalent. Just say "Leftpad-17", it's cleaner.

    And that's it. Package managers should provide no options for version pinning.

    A package manager that followed such a proposal would foster an eco-system with greater care for introducing incompatibility2. Packages that wantonly broke their consumers would "gain a reputation" and get out-competed by packages that didn't, rather than gaining a "temporary" pinning that serves only to perpetuate them. The occasional unintentional breakage would necessitate people downstream cloning repositories and changing dependency URLs, which would create a much more stringent atmosphere of accountability for the breaking package. As a result, breaking changes wouldn't live so long that they gain new users.

    In particular, Semantic Versioning is misguided, an attempt to fix something that is broken beyond repair. The correct way to practice semantic versioning is without any version strings at all, just Rich Hickey's directive: if you change behavior, rename. Or ok, keep a number around if you really need a security blanket. Either way, we programmers should be manually messing with version numbers a whole lot less. They're a holdover from the pre-version-control pre-internet days of shrink-wrapped software, vestigial in a world of pervasive connectivity and constant push updates. All version numbers do is provide cover for incompatibilities to hide under.


    footnotes

    1. One exception here is Go, where the standard go get command requires no versions, and always grabs the head of the repo. However, the community seems to be turning from the light to the darkness with a proposal for a tool called go dep. It's unclear to me if this is due to a failure of communication on the part of the original authors of Go, or if there's a deeper justification for go dep that I'm missing. If you know, set me straight in the comments below.

    2. Following Steve Losh, we'll allow packages ending in "-0" to make incompatible changes to their heart's content.

    Making the big picture easy to see, in software and in society at large.
    selected
    Code
    Prose (shorter; favorites)