Behind the Curve: "New" vs "Compatible" in Node.js Package Development

The pace of Node.js development has created a complicated space for growing and maintaining reusable libraries. As new features are introduced, there's a certain pressure to keep up with the latest and greatest in order to simplify existing code and take advantage of new capabilities; but there's pressure in the opposite direction too, since projects which depend on the package aren't always themselves keeping up with Node.

My main open source project is Massive.js. It's a data access library for Node and the PostgreSQL relational database. I started participating in its development back before io.js merged back into Node and brought it up to ES6, and as of right now I'm still using it in one (not actively developed) product with an old-school callback-based API. I'm also relying on it in other projects with Node 8, the latest stable release line, so I've gotten to use a lot of the newer features, which have collectively made Node development a lot more fun.

Given that libraries like mine are used with older projects and on older engines, the code has to run on as many of them as is practical. It's easy to assume with open source projects that if someone really needs to do whatever it is your package does in an engine from the stone age (better known as "yesterday" in Node) they can raise an issue or submit a pull request, or worst case fork your project and do whatever they have to to make it work. But in practice, the smaller the userbase for a package the less point there is to developing it in the first place, so there's a delicate balance to strike between currency and compatibility.

Important Numbers in Node.js History

  • 0.12: The last version before io.js merged back into Node and brought the newest version of Google's V8 engine and the beginnings of ES6 implementation with it.
  • 4: The major release series beginning with the reintegration of io.js in September 2015. Some ES6 language features such as promises and generators become natively available, freeing those Node developers able to upgrade from "callback hell". Node also moves to an "even major versions stable with long term support, odd major versions active development" release pattern.
  • 6: The 2016 long term support (LTS) release series rounds out the ES6 feature set with proxies, destructuring, and default function parameters. The first is a brand new way of working with objects, while the other two are big quality-of-life improvements for developers.
  • 8: The 2017 LTS release series, current until Node 10 is released in April 2018. The big deal here is async functions: promises turned out to still be a bit unwieldy, leading to the rise of libraries like co exploiting generators to simplify asynchronous functionality. With async/await, these promise management libraries are no longer needed.
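To make the jump from version 4 to version 8 concrete, here is a minimal sketch of the same two-step lookup written first as a promise chain and then as an async function. The findUser and findTeam functions and their data are hypothetical stand-ins for real database calls, and the async version needs Node 8 or newer to run.

```javascript
'use strict';

// Hypothetical promise-returning lookups standing in for real queries.
function findUser(id) {
  return Promise.resolve({ id: id, name: 'alice', teamId: 7 });
}

function findTeam(id) {
  return Promise.resolve({ id: id, name: 'core' });
}

// Node 4 style: native promises, with nesting so that both `user` and
// `team` stay in scope for the final step.
function describeWithPromises(userId) {
  return findUser(userId).then(function (user) {
    return findTeam(user.teamId).then(function (team) {
      return user.name + ' is on ' + team.name;
    });
  });
}

// Node 8 style: async/await keeps both values in one flat scope,
// with no generator helper library involved.
async function describeWithAsync(userId) {
  const user = await findUser(userId);
  const team = await findTeam(user.teamId);
  return user.name + ' is on ' + team.name;
}
```

Both functions return a promise for the same string; the difference is only in how much plumbing the caller has to read past.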

What Maximum Compatibility Means

For a utility library like Massive, the ideal scenario for end users is one where they don't have to care which engine they're using. Still on 0.12, or even before? Shouldn't matter, just drop it in and watch it go. Unfortunately, not only does this mean Massive can't take advantage of new language features, it affects what everyone else can do with the package themselves.

The most obvious impact is with promises, which only became standard in 4.0.0. Prior to that, there were multiple independent implementations like q or bluebird, most conforming to the Promises/A+ standard. For Massive to use promises internally while running on older engines, it would have to bundle one of these. And that still wouldn't make a promise-based API useful unless the consuming project itself integrated a promise library, since the only API metaphor guaranteed available on pre-4.0.0 engines is the callback.

Some of the most popular features which have been added to the language specification are ways to get away from callbacks. This is with good reason, although I won't go into detail here; suffice it to say, callbacks are unwieldy in the best of cases. Older versions of Massive even shipped with an optional "deasync" wrapper which would turn callback-based API methods into synchronous -- blocking -- calls. This usage was wholly unsuitable for production, but made it easier to get a project off the ground.

A Breaking Point

With the version 4 update, actively developed projects started moving toward promises at a good clip. We started seeing the occasional request for a promise-based API on the issue tracker. My one older project even got a small "promisify" API wrapper around Massive as we upgraded the engine and started writing routes and reusable functions with promises and generators thanks to co. Eventually things got to the point where there was no reason not to move Massive over to promises: anything that still needed callbacks was likely stable with the current API, if not legacy code outright.
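A wrapper of the kind described above can be quite small. The following is a sketch of the general technique, not the code from my project: promisify here is an illustrative helper, and legacyDb is a hypothetical callback-based API invented for the example.

```javascript
'use strict';

// Wrap a Node-style function (last argument is an (err, result) callback)
// so that it returns a native promise instead.
function promisify(fn, context) {
  return function () {
    const args = Array.prototype.slice.call(arguments);
    return new Promise(function (resolve, reject) {
      // Append our own callback and let it settle the promise.
      args.push(function (err, result) {
        if (err) { reject(err); } else { resolve(result); }
      });
      fn.apply(context, args);
    });
  };
}

// Hypothetical callback-based data access object.
const legacyDb = {
  rows: [{ id: 1 }, { id: 2 }],
  find: function (criteria, callback) {
    const self = this;
    setImmediate(function () {
      callback(null, self.rows.filter(function (row) {
        return row.id === criteria.id;
      }));
    });
  }
};

// The wrapped function can now be chained with .then or awaited.
const find = promisify(legacyDb.find, legacyDb);
```

Node 8 later added util.promisify to the standard library for the same job, so a hand-rolled helper like this only matters on older engines.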

This meant a clean break. The new release of Massive could use promises exclusively, while anything relying on callbacks would have to stay on the older version. By semantic versioning standards, an incompatible API change requires a new major version. In addition to complying with semver, releasing the promise-based implementation as 3.0.0 would allow urgent patches to be made on the existing 2.x series concurrently with new and improved 3.x releases.

Multiple Concurrent Releases with Tags

The npm registry identifies specific release series with a "dist-tag" system. When I npm publish Massive, it updates the release version on the latest tag; when a user runs npm install massive, whatever latest points to is downloaded to their system. Package authors can create and publish to other tags if they don't want to change the default (since without an alternative tag, latest will be updated). This is frequently used to let users opt in to prereleases, but it can just as easily let legacy users opt out of updates.

Publishing from a legacy branch in the code repository to a second tag means installing the most recent callback-based release is as easy as npm i massive@legacy. Or it could be even simpler: npm i massive@2 resolves to the latest release with that major version. And of course, npm saves dependencies to package.json with a caret range by default, which disallows major version changes, so there are no worries about accidental upgrades.
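As a rough mental model of how those three install commands differ, here is a toy resolver. It is a simplification under stated assumptions, not npm's actual algorithm: the registry object, its version numbers, and the resolve function are all hypothetical, and prerelease versions and full range syntax are ignored.

```javascript
'use strict';

// Hypothetical registry metadata for a single package.
const registry = {
  distTags: { latest: '3.2.0', legacy: '2.6.1' },
  versions: ['2.5.0', '2.6.0', '2.6.1', '3.0.0', '3.1.0', '3.2.0']
};

// Compare two plain x.y.z versions numerically, part by part.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) { return pa[i] - pb[i]; }
  }
  return 0;
}

// Resolve the part after the "@" in an install command:
//   no spec     -> whatever the latest tag points to
//   a tag name  -> whatever that tag points to
//   a major     -> the highest published version in that major series
function resolve(spec) {
  if (spec === undefined) { return registry.distTags.latest; }
  if (Object.prototype.hasOwnProperty.call(registry.distTags, spec)) {
    return registry.distTags[spec];
  }
  const matches = registry.versions
    .filter(function (v) { return v.split('.')[0] === String(spec); })
    .sort(compareVersions);
  return matches.length ? matches[matches.length - 1] : null;
}
```

With this data, resolve() picks the 3.x release, while resolve('legacy') and resolve('2') both land on the newest 2.x release.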

You can list active dist-tags by issuing npm dist-tag ls, and manage them through other npm dist-tag commands.

The One Time I Kind of Screwed Up

In July, a user reported an issue using Massive 3.x on a version 4 series engine. The version 6 stable release had been out for a while, and my active projects had already been upgraded to that for some time. The even newer version 8 series, with full async and await support, had just been released. The problem turned out to be that I'd unwittingly used default function parameters to simplify the codebase. This feature was only introduced in the version 6 release series, which meant Massive no longer functioned with version 4 engines.
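The kind of change that causes this is easy to miss in review. Below is an illustrative pair of functions, not Massive's actual code: findWithDefault uses a default parameter, and findCompatible is a rough equivalent that a version 4 engine can also parse. Both names and the limit option are hypothetical.

```javascript
'use strict';

// Node 6+: the default applies when `options` is omitted or undefined.
function findWithDefault(criteria, options = {}) {
  return { criteria: criteria, limit: options.limit || 10 };
}

// Version 4 compatible: the same fallback written out by hand.
function findCompatible(criteria, options) {
  options = options === undefined ? {} : options;
  return { criteria: criteria, limit: options.limit || 10 };
}
```

An engine without default parameter support rejects the first form as a syntax error when the file is parsed, so the whole module fails to load rather than just the one function.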

Fixing the issue to allow Massive to run on the older engine would be a bit annoying, but possible. However, I had some ideas in the works that would require breaking compatibility with the version 4 series anyway: proxies cannot be polyfilled or transpiled for engines that lack them, so anything using them can only run on version 6 series and newer engines. Rather than fix compatibility with an engine which was now superseded twice over only to break it again later, I ultimately decided to leave well enough alone and clarify the engine version requirement instead.
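One common way to state an engine requirement is the engines field in package.json, which npm treats as advisory by default. A package can also fail fast at load time with a check like the sketch below. This is an assumption about how one might do it, not what Massive does; assertEngine and majorOf are hypothetical helpers, and the code sticks to old syntax so that an old engine can at least parse it and report the error.

```javascript
'use strict';

// Extract the major version number from strings like "8.9.0" or "v8.9.0".
function majorOf(version) {
  return parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
}

// Throw a readable error when the engine is older than the minimum major.
// `version` defaults to the running engine; passing it explicitly is only
// for demonstration.
function assertEngine(minimumMajor, version) {
  var current = version === undefined ? process.versions.node : version;
  if (majorOf(current) < minimumMajor) {
    throw new Error(
      'Node ' + minimumMajor + ' or newer is required, found ' + current
    );
  }
  return true;
}
```

Calling assertEngine(6) at the top of the entry point, before any file using newer syntax is required, turns a confusing syntax error into a clear message.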

Move Slowly and Deliberately and Try Not to Break Things

The main lesson of package development on Node is that you have to stay some distance behind current engine developments in order to reach the most users. How far behind is more subjective and depends on the project and the userbase. I think Massive is fine one full LTS version back, but a contrasting example can be found in the pg-promise driver it uses. Its author, Vitaly, even goes as far as allowing non-native promise libraries to be dropped in, which hasn't strictly been necessary since 2015 -- unless you're stuck on an engine from before the io.js merge, which users of a more general-purpose query tool seem more likely to be.

Following semantic versioning practices not only ensures stability for users, but also makes legacy updates practical -- just check out the legacy branch, fix what needs fixing, and publish to the legacy tag instead of latest. One new feature and a couple of patches have actually landed on Massive v2 so far, but it's generally been quiet.

Having a clearly-defined standard for versioning has also helped manage the pace of continued development better: figuring out when and how to integrate breaking changes to minimize their impact is still tough, but it's vastly preferable to holding off on them indefinitely.