by tomdale 3537 days ago
This is a huge leap forward for the JavaScript community—probably more than many people will realize right away.

I loved Bundler's deterministic builds but chafed against the Ruby limitation of only having a single version of a dependency at once. npm solved this problem elegantly, but still struggles with non-determinism. I had resigned myself to thinking that maybe these were just fundamental tradeoffs in package management that could never be resolved.

Then I had the opportunity to use Cargo, the package manager for Rust. It synthesized what, in my mind, are the best features of npm and Bundler into a single package manager, with terrific speed to boot.

Yarn looks like it will be Cargo for JavaScript. After Cargo improved on many of the great ideas of npm, those ideas are returning to help improve the JavaScript ecosystem. Cross-pollination at its best.

This also highlights a shrewd move on the part of npm: a clear decoupling between the npm registry and client, with a well-defined protocol between the two. The strength of Node is in the staggering size of its ecosystem; how those bits end up on disk is an implementation detail. This smart separation allows for this kind of experimentation on the client side, without causing the ecosystem fragmentation that happens when a new package manager requires a new registry.

I'm also happy to see that a significant amount of work has already gone into the governance model. Despite the largest contributors being Facebook employees, it looks like they've really outdone themselves making sure this is a community-run project. A BSD license, no PATENTS file, and an Ember/Rust RFC process[1]; this has all the hallmarks of a community open source project. That's critical for open infrastructure projects like this, and they've nailed it.

[1]: https://github.com/yarnpkg/rfcs

I'm very much looking forward to using Yarn in my own projects because it looks like it solves a lot of real problems I encounter every day. Thanks for all the hard work!


> This also highlights a shrewd move on the part of npm: a clear decoupling between the npm registry and client, with a well-defined protocol between the two. The strength of Node is in the staggering size of its ecosystem; how those bits end up on disk is an implementation detail. This smart separation allows for this kind of experimentation on the client side, without causing the ecosystem fragmentation that happens when a new package manager requires a new registry.

And a shrewd move by FB: to not announce their new registry on the same day.

> Yarn pulls packages from registry.yarnpkg.com, which allows them to run experiments with the Yarn client. This is a proxy that pulls packages from the official npm registry, much like npmjs.cf.[0]

Time will tell whether they only want to be proxying NPM or will allow direct pushing to their own registry. If they do, JS ecosystem might see another big shift.

[0] http://blog.npmjs.org/post/151660845210/hello-yarn

That big shift will have to happen first. I don't see them ever making their own registry unless 99.99% of people are using yarn and are having a lot of problems with the current npm registry. While I see a lot of people using yarn, I'm not sure about 99.99% and I think npm's registry itself is pretty good.

So I don't think interests will ever align to create a new registry. Nobody wants to do that. That would have serious consequences for the JS community and would take years to recover from, in my opinion.

Why would it be a bad thing to support additional repositories? Personally, I don't like how centralized the JS ecosystem is.

For example, if I refer to 'left-pad', it would default to 'npmjs.org/left-pad'. If the author goes rogue, I think it would be great to enable people to publish and consume 'thirdparty.com/left-pad'.

Disclosure: I'm a FB employee with no knowledge of our plans in this regard

1. any left-pad issue has been "eliminated" with the new rules npm (the company) has enforced.

2. you can already have your own version of thirdparty/left-pad by maintaining your own npm repository. i think what your parent post is referring to is facebook saying we're going to make our own public node/javascript package registry and you should publish to our registry.

doing this at the moment does nothing for the community other than causing a lot of pain points. ex) now npm authors will have to publish to both registries so developers don't have to dig to find where it was published to, then they also have to hope that someone else didn't register the module name in one of the registries..

there is just too much splintering if facebook decided to become a competing registry rather than just using npm's registry and building on top of it.

> any left-pad issue has been "eliminated" with the new rules npm (the company) has enforced.

Only to resurface again: http://status.npmjs.org/incidents/dw8cr1lwxkcr

And it will, no doubt, resurface again and again and again

Somewhat funny to have an FB employee complaining about overly centralized systems.
I used to find such comments funny until I began working for Big Name Corps myself and I realized how much one's personal philosophy could be inconsistent or sometimes even contradictory to the employer's philosophy. I make money by selling my skills to an employer despite inconsistent philosophies. I think it's like a chef that can cook meat for his/her guests although the chef has decided to refrain from consuming meat himself/herself.
Have a read here: https://github.com/nodejs/NG/issues/29#issuecomment-17431452... - it's a writeup that I made a while ago about the requirements for a stable (decentralized) package registry, and it addresses your question as well.
> So I don't think interests will ever align to create a new registry.

Possibly not soon, at least from a technical perspective, but I could definitely see a PR fiasco (security, bias, etc.) causing a loss of confidence in its stewardship.

And I don't think it'd be that big of a disruption if it were to happen; for the hypothetical case of Facebook+Yarn, they're already proxying to NPM, so they could easily continue to do so while also accepting packages that are Yarn-only.

I more or less agree with this.
Even if they don't get 99% of users (or any major percentage) I think it would still benefit the community to have an alternative to npm. Also Facebook has the advantage of not needing to make a business out of it (they already got a pretty good one) so in theory it could be entirely open source and free.
Just to expand on the stricter versioning rules that I mentioned, some things that in my opinion could improve the reliability of a package repository:

- strictly defined version update time intervals, e.g. you can't update your package more than once a week (or have to take some special actions for critical updates, e.g. contact support)

- "delayed" publishing, e.g. when you submit your package it will only be published in 24 hours, until then you can re-submit updates, etc.

- similar to above, but your package won't be published until it has been tagged on github (or elsewhere) for a certain amount of time

- published packages can not be removed, but you can disassociate them with your account by marking them as "not maintained" and possibly assign new maintainers for it

- maybe introduce some way for developers to mark new versions as "backwards incompatible" if they do break backwards compatibility

I think there is definitely a "market" for some stricter node package repo.

I find it strange that the time isn't invested in already existing projects.

But at least it's a move away from NPM. I think most of the problems I had with JavaScript development in the last 2 years came from NPM.

Yarn isn't a replacement for npm itself. It's a client that can read/write to npm, and other registries such as Bower.
Well, it seems like it could be according to the article. What would you still need to use NPM (the cli) for, other than package hosting?

According to http://blog.npmjs.org/post/151660845210/hello-yarn, it seems it doesn't work with private packages yet, which may or may not be an issue for your project. But it seems this is a complete CLI replacement for NPM.

Yes, it replaces the client, but it uses the same package repository and package format.
And thus could have been implemented as part of the existing client.
"Could have been implemented as part of the existing client" isn't the same as "Should have been implemented as part of the existing client".

I personally don't know much about either tool (don't do a ton of JS), but it's possible that fixing the existing client without either breaking backward compatibility or making it too complicated (multiple modes of operation) was too difficult or not worth it.

Also, I'm having a really hard time understanding the complaint about a new client. The value is in repository of reusable code, not the client. That you can use different clients with the same repository is a feature, not a bug.

npm = node package manager = CLI tool which Yarn replaces.

npmjs.com = package repo which Yarn can use.

at least as far as I can tell

I'm almost sure the complaint was about npm as a tool, not about the repo :)
Would you mind going a bit into detail? I work with npm on a daily basis and I am pretty happy so far.
The whole version range stuff got me many times. I went to use fixed versions on my own package.json files, but the deps of my deps could still be dynamic, which is even worse, since they sit deeper in my dependency graph AND there are more indirect deps than direct deps. (~50 direct, >200 indirect)
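To make the range problem concrete, here is a minimal sketch of how a caret range admits newer releases. It assumes plain `major.minor.patch` strings with no pre-release tags, and `satisfiesCaret` is a hypothetical helper written for illustration, not npm's actual implementation:

```javascript
// Parse "1.2.3" into [1, 2, 3]. Pre-release tags are not handled.
function parse(v) {
  return v.split(".").map(Number);
}

// Compare two parsed versions: -1, 0 or 1.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Hypothetical helper: does `version` satisfy a caret range such as "^1.2.0"?
function satisfiesCaret(version, range) {
  const v = parse(version);
  const base = parse(range.slice(1)); // drop the leading "^"
  // The upper bound bumps the left-most non-zero component.
  let upper;
  if (base[0] > 0) upper = [base[0] + 1, 0, 0];
  else if (base[1] > 0) upper = [0, base[1] + 1, 0];
  else upper = [0, 0, base[2] + 1];
  return compare(v, base) >= 0 && compare(v, upper) < 0;
}

console.log(satisfiesCaret("1.9.3", "^1.2.0")); // true
console.log(satisfiesCaret("2.0.0", "^1.2.0")); // false
```

So even if your own package.json pins exact versions, a transitive dependency declared as `^1.2.0` can resolve to any later 1.x release on the next clean install.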

Also, npm isn't deterministic and it got even worse with v3. Sometimes you get a flat list of libs, if a lib is used with multiple versions, the first usage will get installed flat, the rest in the directory of the parent lib, etc.
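A toy model of that behaviour, as I understand it: the first version encountered is placed at the top level and any later, different version is nested under its parent. The function name `layout` and the data shape are assumptions for illustration; this is not npm's real algorithm.

```javascript
// Simplified model of a flattening installer.
function layout(requests) {
  const top = {};    // name -> version installed at the top level
  const nested = []; // "parent/node_modules/name@version" entries
  for (const { parent, name, version } of requests) {
    if (!(name in top)) {
      top[name] = version; // first usage wins the flat slot
    } else if (top[name] !== version) {
      nested.push(`${parent}/node_modules/${name}@${version}`);
    }
  }
  return { top, nested };
}

// Same dependencies, different install order.
const first = layout([
  { parent: "a", name: "x", version: "1.0.0" },
  { parent: "b", name: "x", version: "2.0.0" },
]);
const second = layout([
  { parent: "b", name: "x", version: "2.0.0" },
  { parent: "a", name: "x", version: "1.0.0" },
]);

console.log(first.top.x, first.nested);   // x@1.0.0 is flat, x@2.0.0 sits under b
console.log(second.top.x, second.nested); // x@2.0.0 is flat, x@1.0.0 sits under a
```

In this model the resulting tree depends on the order in which dependencies are processed, which is the kind of non-determinism described above.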

The npm-cli is basically a mess :\

The fun of kicking off a CI build after the weekend with no commits and see stuff randomly break because some dependency of a dependency got updated and broke things in a minor version is something I've only experienced in JS - beautiful.
In fairness to the language and tools, this seems to be more of a cultural problem than anything.

You can do the same kind of version range tricks in typical Java builds, for example (Maven), but most people hardcode the values to keep builds as deterministic as possible.

For some reason, the JS community seems to prefer just trusting that new versions won't break anything. It's either very brave of them really (or maybe just foolish).

> the JS community seems to prefer just trusting that new versions won't break anything. It's either very brave of them really (or maybe just foolish).

Let's not pretend that we aren't all blindly tossing in random libs of dubious quality and origin we find on github into our package.json and hoping for the best anyway. My company talks a mean talk about "best practices", but, my god, the dependencies we use are a true horror show.

I hate non-reproducible builds and semver-relaxed dep-of-the-dep issues, but, while a broken dep fails the build for lots of people (downside), the upside of this is that very quickly (within hours of a new dep being published) there will be lots of angry people complaining about it on GitHub, and a faulty dep will typically be rolled back or superseded with a patch quickly. Otherwise, some bugs might be sitting hidden for a long time.
But can yarn fix this?

Say that I use yarn to depend upon some module X, specifying a fixed version number like the good boy scout that I am. Module X resides on npmjs and in turn depends upon the bleeding edge version of module Y. And then one day module Y goes all Galaxy Note and bursts into flames.

Can yarn shield my app from that?

It's not a cultural problem. People make mistakes. People don't know what a non-breaking change is, especially those not well versed in refactor work.

I don't think Yarn solves any of these problems, tbh. It seems like what we really need is a package manager that tests the api of each package update to make sure nothing in it has broken expectation in accordance with Semver.

You shouldn't be kicking off CI builds based on unpinned deps (unless you're deliberately doing 'canary testing' etc), because of course that will break. The npm solution for this is to use 'npm shrinkwrap' and you should always have been using this at your site/project level otherwise there was no hope it could work.

It's not that npm devs were naive enough to believe that unpinned deps would be safe for reproducible builds.

However I've heard several people allude over the years that 'npm shrinkwrap' is buggy, and isn't fully reproducible (though never experienced any problems personally). This is the aspect yarn claims to address, along with a supposedly more succinct lockfile format.

With or without npm-shrinkwrap.json? Not chastising, I'm sincerely asking.
Obviously without until we broke and looked in to it :D

JS dev has been a minefield like that, the entire ecosystem reminds me of C++ except lower barrier to entry means a lot of noise from newbies on all channels (eg. even popular packages can be utter crap so you can't use popularity as a meaningful metric)

I agree 100%, but the default upgrade strategy for npm --install does not help matters: it's much saner to wildcard patch versions only and lock major.minor to specific versions.

this obviously doesn't fix anything and I think the points in this discussion stand, but I've never understood why the defaults are not more conservative in this regard.

Well, now you can do that in Cargo as well :-)

What we do currently is we lock everything to an explicit version - even libraries.

At least it's possible to get deterministic builds if you are willing to do a bit of work carefully / manually updating all of your dependencies at once.

You shouldn't have that happen with Cargo, given that we have a lockfile. Even when you specify version ranges, you're locked to a single, specific version.
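Roughly, the idea can be sketched like this. It is a toy resolver, not Cargo's or Yarn's real code; `resolve`, the `lock` object and the predicate-style range are all assumptions made for the example.

```javascript
// `available` lists published versions in ascending order, `range` is a
// predicate, and `lock` is a plain object standing in for a lockfile.
function resolve(name, range, available, lock) {
  // Reuse the pinned version as long as it still satisfies the range.
  if (lock[name] && range(lock[name])) return lock[name];
  const candidates = available.filter(range);
  if (candidates.length === 0) throw new Error(`no version of ${name} matches`);
  const chosen = candidates[candidates.length - 1]; // newest matching version
  lock[name] = chosen; // record the choice for future installs
  return chosen;
}

const lock = {};
const anyOneDotX = (v) => v.startsWith("1.");

const v1 = resolve("y", anyOneDotX, ["1.0.0", "1.1.0"], lock);
// Later a new release appears, but the lock still pins the earlier choice.
const v2 = resolve("y", anyOneDotX, ["1.0.0", "1.1.0", "1.2.0"], lock);

console.log(v1, v2); // 1.1.0 1.1.0
```

The range only matters the first time, or when you explicitly update; after that the lockfile decides.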
npm shrinkwrap
NPM is deterministic when using the same package.json and there is no existing node_modules folder.

And if you want to lock versions for your entire dependency tree, npm shrinkwrap is what you're looking for (It's essentially the same as lockfiles in other development package managers). Though for security reasons I prefer to keep things locked to a major version only (e.g. "^2.0.0"). Shrinkwrapping is useful in this instance too if you need to have predictable builds (and installs as it'll use the shrinkwrap when installing dependencies too if it's a library rather than the main app) but want to ensure your dependencies stay up to date.

It's not perfect by any measure, but there are ways to make it work the way you want.

Mhm that sounds reasonable. I haven't gotten into problems with versions yet but I don't really trust npm update, because I am not always sure how it behaves.

From a build tool perspective (we use npm scripts for basically everything [and some webpack]) I am also not missing something particular.

Looking at other comparable solutions (from other languages) I'd say npm does a pretty good job.

sure, it's better than pip or something.

but it's still far from perfect.

the main problem with npm is just when it's used internally. It is painful really. Using it when you have an internet connection is just seamless
Multiple versions may sound like it's useful, but it's almost always a bad idea. Cargo doesn't allow it either.

The problem isn't really fundamental. Bundler makes almost all the right choices already. Its major disadvantage is that it only works for Ruby.

As a practical matter, the npm ecosystem today relies on duplication, and no new client that made the "highlander rule" (there can be only one) mandatory could succeed.

Yarn does offer a `--flat` option that enforces the highlander rule, and I'm hopeful that the existence of this option will nudge the ecosystem towards an appreciation for fewer semver-major bumps and more ecosystem-wide effort to enable whole apps to work with `--flat` mode.

Plz send halp!

Explain why duplication is mandatory?
Just imagine two packages you depend on (a and b) that both have a shared dependency (x). Both start off depending on x version 1.0, but then later a is updated to depend on x 2.0 while b isn't. Now you have two packages depending on different versions of the same package and hence the need for duplication. You have a that needs x@2.0 and b that needs x@1.0, so both copies are kept.
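The a, b and x scenario can be sketched as follows. The `deps` object and the `requestedVersions` helper are hypothetical and exist only to show when more than one copy is needed:

```javascript
// Hypothetical dependency declarations for the example.
const deps = {
  a: { x: "2.0" },
  b: { x: "1.0" },
};

// Collect, for every dependency name, the set of distinct versions requested.
function requestedVersions(deps) {
  const wanted = new Map();
  for (const requirements of Object.values(deps)) {
    for (const [name, version] of Object.entries(requirements)) {
      if (!wanted.has(name)) wanted.set(name, new Set());
      wanted.get(name).add(version);
    }
  }
  return wanted;
}

const versionsOfX = [...requestedVersions(deps).get("x")].sort();
console.log(versionsOfX.length); // 2, so two copies of x end up installed
```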
Don't upgrade a when it wants a half-baked x. Choose versions of a and b that agree on a known-good version of x. If there aren't any, it's not sane to use a and b together unless x is written very carefully to accommodate data from past and future versions of itself.
It's not as simple as that. Lodash is a great example of why the highlander rule doesn't work within the npm ecosystem: older versions are depended on by many widely-used packages which are now "complete" or abandoned. Refusing to use any packages which depend on the latest version of Lodash is just not practical.
That's not how it works. There will be two copies of x in the require cache. They don't know of each other's existence.
Would it be possible to create hardlinks or symlinks to a particular package/version pair shared as a dependency between other packages? I know this only works on unix-like OSes but otherwise it could revert to the old behaviour of duplicating the dependency.
I think they're just saying that any new client that tried to not support duplication at all would likely quickly run into a large amount of npm packages/package combinations that just don't work. So within the context of using the npm registry duplication is mandatory.
Cargo definitely allows two dependencies to rely on different versions of a transitive dependency. If those deps expose a type from that dependency in their API, you can very occasionally get weird type errors because the "same" type comes from two different libraries. But otherwise it Just Works(tm).
Cargo does allow multiple versions of transitive dependencies. It tries to unify them as much as possible, but if it can't, it will bring in two different versions.

What it _won't_ let you do is pass a v1.0 type to something that requires a v2.0 type. This will result in a compilation error.

There's been some discussion around increasing Cargo's capacity in this regard (being able to say that a dependency is purely internal vs not), we'll see.

With tiny modular utilities this is very much necessity - and not a bad idea if the different versions are being used in different contexts.

For instance, when using gemified javascript libraries with bundler it is painful to upgrade to a different version of a javascript library for the user facing site while letting the backend admin interface continue using an older one for compatibility with other dependencies.

You've got to take the ecosystem into account. There are a lot of very depended-upon modules that are 1 KB of code and bump their major version as a morning habit. Forcing people to reconcile all 8 ways they indirectly depend on the module would drive them nuts, but including 4 copies of the module doesn't hurt much.
Yarn supports `yarn [install/add] --flat` for resolving dependencies to a single version
> Multiple versions [...] almost always a bad idea

If so, different major versions of the same dep should be considered different libraries, for the sake of flattening. Consider lodash for example.
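A rough sketch of what that could look like: key each request by name and major version, then flatten within each key. `flattenByMajor` is a made-up helper, the version numbers are only illustrative, and it assumes the highest version within a major line is acceptable to every requester.

```javascript
// Flatten each major line independently; within a line the highest version wins.
function flattenByMajor(requests) {
  const best = new Map(); // "name@major" -> { version, minor, patch }
  for (const { name, version } of requests) {
    const [major, minor, patch] = version.split(".").map(Number);
    const key = `${name}@${major}`;
    const current = best.get(key);
    const newer =
      !current ||
      minor > current.minor ||
      (minor === current.minor && patch > current.patch);
    if (newer) best.set(key, { version, minor, patch });
  }
  const result = {};
  for (const [key, { version }] of best) result[key] = version;
  return result;
}

const flat = flattenByMajor([
  { name: "lodash", version: "3.10.1" },
  { name: "lodash", version: "4.2.0" },
  { name: "lodash", version: "4.17.0" },
]);

console.log(flat); // one entry per major line: lodash@3 and lodash@4
```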

That's exactly what Cargo does, but it also takes a further step of making `^` the default operator and strongly discouraging version ranges other than "semver compatible" post-1.0.
Looks like Yarn does something similar: https://yarnpkg.com/en/docs/cli/add#toc-yarn-add-exact

Personally I'm not really sure I like it. If I specify an exact revision of something, chances are I really do mean to install that exact revision. I don't see why I need an extra flag for that.

That can cause some serious problems at least some of the time. I've dealt with the subtle errors caused by this problem in C++, and don't really know JavaScript libraries that well, so I can't give a more concrete example. But imagine that there are the following libraries:

* LA: handles linear algebra and defines a matrix object.

* A: reads in a csv file and generates a matrix object using LA

* B: takes in a matrix object from LA, and does some operations on it

In this case, if B depends on version 5 of LA and the new version of A depends on version 6 of LA, then there's going to be a problem when an object that A generated with version 6 is passed to B, which depends on version 5.

The problem does happen in JavaScript. But since it's unityped, there is a strategy to deal with it:

* Figure out early on (before 1.0) what your base interface will be.

For example, for a promise library, that would be `then` as specified by Promises/A+

* Check if the argument is an instance of the exact same version.

This works well enough if you use `instanceof`, since classes defined in a different copy of the module will have their own class value - a different unique object.

  * If instanceof returns true, use the fast path code (no conversion)
  * Otherwise, perform a conversion (e.g. "thenable" assimilation) that
    only relies on the base interface
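The strategy above can be sketched as follows. Two copies of a library are simulated by calling a factory twice; `loadLibraryCopy` and `Deferred` are invented for the example, and the toy `then` is synchronous, unlike a real Promises/A+ implementation.

```javascript
// Each call returns a distinct class, just as each loaded copy of a module
// would define its own class object.
function loadLibraryCopy() {
  return class Deferred {
    constructor(value) {
      this.value = value;
    }
    // Base interface: a (synchronous, simplified) `then`.
    then(onFulfilled) {
      return new Deferred(onFulfilled(this.value));
    }
    // Accept our own instances or anything exposing the base `then`.
    static from(input) {
      if (input instanceof Deferred) return input; // fast path, no conversion
      let captured;
      input.then((v) => {
        captured = v; // assimilate using only the base interface
      });
      return new Deferred(captured);
    }
  };
}

const CopyA = loadLibraryCopy();
const CopyB = loadLibraryCopy();

const fromB = new CopyB(42);
console.log(fromB instanceof CopyA); // false: different class objects

const converted = CopyA.from(fromB);
console.log(converted instanceof CopyA, converted.value); // true 42

const own = new CopyA(1);
console.log(CopyA.from(own) === own); // true: fast path returns the same object
```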
It's not easy, but it's not always necessary either. Most JS libraries don't need to interoperate with objects from previous versions of themselves.
Would this even work in what I describe? For instance, say mat.normalize() was added in LA-6, and B provides an LA-5 mat. Then A (which has been updated to use the new method) calls mat.normalize() on the LA-5 mat expecting an LA-6 mat, but because of duck-typing that method doesn't exist.
It would not.

However, since A exposes outside methods that take a matrix as an argument, it should not assume anything beyond the core interface and should use LA-6's cast() to convert the matrix.
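Something along these lines, with made-up `MatrixV5` and `MatrixV6` classes standing in for LA-5 and LA-6. The `rows` property as the core interface and the meaning of `normalize` are assumptions for the sake of the example:

```javascript
// Stand-in for an LA-5 matrix: only the core interface (`rows`).
class MatrixV5 {
  constructor(rows) {
    this.rows = rows;
  }
}

// Stand-in for an LA-6 matrix.
class MatrixV6 {
  constructor(rows) {
    this.rows = rows;
  }
  // Added in "version 6": scale so the largest absolute entry becomes 1.
  normalize() {
    const max = Math.max(...this.rows.flat().map(Math.abs));
    return new MatrixV6(this.rows.map((r) => r.map((v) => v / max)));
  }
  // Convert any object exposing the core interface into a MatrixV6.
  static cast(m) {
    return m instanceof MatrixV6 ? m : new MatrixV6(m.rows);
  }
}

// Library A's public entry point casts before using version-6-only methods.
function normalized(matrix) {
  return MatrixV6.cast(matrix).normalize();
}

const old = new MatrixV5([[2, 4]]);
console.log(typeof old.normalize); // "undefined": the LA-5 object lacks the method
console.log(normalized(old).rows); // rows become [[0.5, 1]]
```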

The problem is partially alleviated when using TypeScript. In that case the inferred type for A demands a structure containing the `normalize` method, which TypeScript will report as incompatible with the passed LA-5 matrix (at compile time). That makes it clearer that `cast` would need to be used.

Define the matrix class in a separate library for compatibility purposes if it's so widely used and doesn't change. Maybe some people don't need the linear algebra and instead only the definition of the matrix object?

Another solution is to provide version overrides and make B depend on version 6.

However if there are differences in the matrix class between different versions of the library then you're forced to write a compatibility layer in any case.

If an API expects the outside world to hand it an instance of a specific library, all bets are off. Maybe it gets `null` or the `window` object, who knows? But a library can at least declare what dependencies it wants. If you take that away, it ratchets up the uncertainty factor that much more.
This is especially true when you're going to be serving your code over the web. It's very easy when using npm to accidentally end up shipping a lot of duplicate code.

That alone has me super excited about Yarn, before even getting into the performance, security, and 'no invisible dependency' wins.

I feel like this is especially problematic when using NPM for your frontend. Now you have to run all your code through deduplication and slow down your build times or end up with huge assets. I wonder if it's really worth the trouble.
> I loved Bundler's deterministic builds but chafed against the Ruby limitation of only having a single version of a dependency at once

This is due to the way that Ruby includes code, where it's available globally, vs Node where code is scoped to the including module. I'm not sure how Ruby could support multiple versions without changes to the language and/or RubyGems.

Yes, that's why I said it was a Ruby limitation, not a Bundler limitation.
Right, my comment wasn't necessarily directed at you, but to others who might not be familiar with both.
Why do you think that everybody is inspired by JavaScript tech? It is probably the other way around. For example, there were very good build tools long before npm which are still used today and are immeasurably better (like Maven or Gradle).
funny (and great!) how cargo became a gold standard for package management without any advertising. well done, rustaceans.