Hacker News new | ask | show | jobs
by lhorie 1696 days ago
> which to repeat, is something that happens all the time

Are you sure this isn't just a problem in your organization? As I qualified, the issue you're describing was a real pain maybe two or three years ago, but not anymore IME. For context, my day job currently involves project migrations into a monorepo (we're talking several hundred packages here) and non-reproducibility due to missing lockfiles is just not an issue these days for me.

As the other commenter mentioned, node-gyp is the main culprit of non-reproducibility nowadays, and committing deps doesn’t really solve that precisely because you often cannot commit arch-specific binaries, lest your CI will blow up trying to run mac binaries

1 comments

> Are you sure this isn't just a problem in your organization?

I'm really struggling to understand the kind of confusion that would be necessary in order for this question to make sense.

Why do you suspect that this might be a problem "in [my] organization"? How could it even be? When I do a random walk through projects on the weekend, and my sights land on one where `npm install` ends up failing because GitHub is returning 404 for a dependency, what does how things are done in my organization have to do with that?

I get the dreadful feeling that despite my saying "[That] means nothing if it's not my project", you're unable to understand the scope of the discussion. When people caution their loved ones about the risk of being the victim of a drunk driving accident on New Years Eve, it doesn't suffice to say, "I won't drink and drive, so that means I won't be involved a drunk driving accident." The way we interact with the whole rest of the world and the way it interacts with us is what's important. I'm not concerned about projects under my control failing.

> non-reproducibility due to missing lockfiles is just not an issue

Why do you think that's what we're talking about? That's not what we're talking about. (I didn't even say anything about lockfiles until you brought it up.) You're not seeing the problem, because you're insisting on trying to understand it through a peephole.

I mean, of course I'm going to see this from the lenses of my personal experience (which is that nasty non-reproducibility issues usually would only happen when someone takes over some internal project that had been sitting in a closet for years and the original owner is no longer at the company). Stumbling upon reproducibility issues in 4 year old projects on Github is just not something that happens to me (and I have contributed to projects where, say, Travis CI had been broken in master branch for node 0.10 or whatever) and getting 404s on dependencies is something I can't say I've experienced (unless we're talking about very narrow cases like consuming hijacked versions of packages that were since unpublished) or possibly a different stack that uses git commits for package management (say, C) - and even then, that's not something I've run into (I've messed around w/ C, zig and go projects, if it matters). I don't think it's a matter of me having a narrow perspective, but maybe you could enlighten me.

As I mentioned, my experience involves seeing literally hundreds of packages, many of which were in a context where code rot is more likely to happen (because people typically don't maintain stuff after they leave a company and big tech attrition rate is high, and my company specifically had a huge NIH era). My negative OSS experience has mostly been that package owners abandon projects and don't even respond to github issues in the first place. I wouldn't be in a position to dictate that they should commit node_modules in that case.

Maybe you could give me an example of the types of projects you're talking about? I'm legitimately curious.

> I don't think it's a matter of me having a narrow perspective, but maybe you could enlighten me.

I have to think it is, because the "shape" of your responses here have been advice about things that I/we can do to keep my/our own projects (e.g. mission critical or corporate funded stuff being used in the line of business) rolling, and completely neglected the injured-in-collision-with-other-intoxicated-driver aspect.

Again, I _have_ to think that your lack of contact with these problems has something to do with the particulars of your situation and the narrow patterns that your experience falls within. Of the projects that match the criteria, easily 40% of them are the type I described. (And, please, no pawning it off with a response like "must be a bad configuration on your machine"; these are consistent observations over the bigger part of a decade across many different systems on different machines. It's endemic, not something that can be explained away with the must-be-your-org/must-be-your-installation handwaving.)

> code rot is more likely [...] My negative OSS experience has mostly been that package owners abandon projects and don't even respond to github issues

Sure, but the existence of other problems doesn't negate the existence of this class of problems.

> I wouldn't be in a position to dictate that they should commit node_modules in that case.

Which is why I mentioned that this needs to be baked in to the culture. As it stands, the culture is to discourage simple, sensible solutions and prefers throwing even more overengineered tooling that ends up creating new problems of its own and only halfway solves a fraction of the original ones. (Why? Probably because NodeJS/NPM programmers seem to associate solutions that look simple as being too easy, too amateurish—probably because of how often other communities shit on JS. So NPMers always favor the option that looks like Real Serious Business because it either approaches the problem by heaping more LOC somewhere or—even better—it involves tooling that looks like it must have taken a Grown Up to come up with.)

> Maybe you could give me an example of the types of projects you're talking about? I'm legitimately curious.

Sure, I don't even have to reach since as I said this stuff happens constantly. In this case, at the time of writing the comments in question, it was literally the last project I attempted: check out the Web X-Ray repo <https://github.com/mozilla/goggles.mozilla.org/>.

This is conceptually and architecturally a very simple project, with even simpler engineering requirements, and yet trying to enter the time capsule and resurrect it will bring you to your knees on the zeroeth step of just trying to fetch the requisite packages. That's to say nothing of the (very legitimate) issue involving the expectation that even with a successful completion of `npm install` I'd still generally expect one or more packages for any given project to be broken and in need of being hacked in order to get working again, owing to platform changes. (Several other commenters have brought this up, but, bizarrely, they do so as if it's a retort to the point that I'm making and not as if they're actually helping make my case for me... The messy reasoning involved there is pretty odd.)

> check out the Web X-Ray repo <https://github.com/mozilla/goggles.mozilla.org/>.

Thanks for the example! Peeking a bit under the hood, it appears to be due to transitive dependencies referencing github urls (and transient ones at that) instead of semver, which admittedly is neither standard nor good practice...

FWIW, simply removing `"grunt-contrib-jshint": "~0.4.3",` from package.json and related jshint-related code from Gruntfile was sufficient to get `npm install` to complete successfully. The debugging just took me a few minutes grepping package-lock.json for the 404 URL in question (https://github.com/ariya/esprima/tarball/master) and tracing that back to a top-level dependency via recursively grepping for dependent packages. I imagine that upgrading relevant dependencies might also do the trick, seeing as jshint no longer depends on esprima[0]. A yarn resolution might also work.

I'm not sure how representative this particular case is to the sort of issues you run into, but I'll tell you that reproducibility issues can get a lot worse in ways that committing deps doesn't help (for example, issues like this one[1] are downright nasty).

But assuming that installation in your link just happens to have a simple fix and that others are not as forgiving, can you clarify how is committing node_modules supposed to help if you're saying you can't even get it to a working state in the first place? Do you own the repo in order to be able to make the change? Or are you mostly just saying that hindsight is 20-20?

[0] https://github.com/jshint/jshint/blob/master/package.json#L4...

[1] https://github.com/node-ffi-napi/node-ffi-napi/issues/97

I don't understand your questions.

My message is that for a very large number of real world scenarios the value proposition of doing things the NPM way does not result in a Pareto improvement, despite conventional wisdom suggesting otherwise.

I also don't understand your motivation for posting an explanation of the grunt-related problem in the Web X-Ray repo. It reminds me of running into a bug and then going out of my way to track down the relevant issue and attach a test case crafted to demonstrate the bug in isolation, only to have people chime in with recommendations about what changes to make to the test case in order to not trigger the bug. (Gee, thanks...)

And to reiterate the message at the end of my last comment: the rationale of trying to point at how bad mainstream development manages to screw up other stuff, too, is without.

Personally, I see code as a fluid entity. If me spending a few minutes to find a way to unblock something that you claimed to "bring you to your knees on the zeroeth step" is a waste of time, then I guess you and I just have different ideas of what software is. For me, I don't see much value in simply putting up with some barely working codebase out of some sense of historical puritanism; if I'm diving into a codebase, the goal is to improve it, and if jshint or some other subsystem is hopelessly broken for whatever reason, it may well get thrown out in favor of something that actually works.

You may disagree that the way NPM does things works well enough to be widely adopted (and dare I say, liked by many), or that true reproducibility is a harder problem than merely committing files, but by and large, the ecosystem does hum along reasonably well. Personally, I only deal with NPM woes because others find value in it, not because I inherently think it's the best thing since sliced bread (in fact, there are plenty of aspects of NPM that I find absolutely atrocious). My actual personal preference is arguably even less popular than yours: to avoid dependencies wherever possible in the first place, and relentless leveraging of specifications/standards.