Hacker News
by SirensOfTitan 2312 days ago
I’ve been considering checking node_modules into source control for some time now. Has anyone else done that successfully? There would be a variety of benefits:

1. Eliminate redownloading packages on every CI build.
2. Reduce the heavy IO from unpacking the tens of thousands of files sitting in node_modules.
3. Better security: checked-in code can be audited more easily than code that is downloaded fresh on every CI build.

Yarn’s PnP system is promising for the zero-install paradigm, but it doesn’t seem quite ready yet (many packages don’t seem to declare their dependencies correctly).

4 comments

Checking node_modules into git was the preferred way of working with dependencies in the Node community in the early days. Long before lockfiles, shrinkwrap and friends, this let you use `git diff` and `git bisect` to find out which dependency upgrade broke your application code. Several prominent community members and early adopters of Node advocated for the idea: they liked treating dependencies as an integral part of your app, being familiar with the third-party code you're using, etc.

However, early adopters of npm in the frontend world (back in the Browserify and Require.js days) didn't like the practice, notably because many dependencies contained Node-only code, tests, and scripts that were only needed for building the dependencies themselves, so they started putting node_modules in .gitignore. At the same time, Node people started using other means to get reproducible builds: private npm registries, Dockerfiles, etc.

Over time, both the frontend and Node communities recognized the need for lockfiles, which we eventually got with Yarn and later versions of npm.

Yarn v1's "offline mirror" feature is explicitly meant for this use case. I wrote about it a few years ago, and have been successfully using it since then:

https://blog.isquaredsoftware.com/2017/07/practical-redux-pa...

Did that for a small Ivy-based project (Ivy is a simpler Maven replacement) that had security implications.

We had a task every month for one developer to manually upgrade one or two dependencies and commit the changes after testing (Java libraries tend to release upgrades much more slowly than Node packages).

Helps if you only have one platform you're developing on and deploying to (e.g. x86-64 Linux). If you develop on macOS, Mac-specific binaries may get installed, depending on the package.
That's why npm has the `--ignore-scripts` flag: `npm install --ignore-scripts` downloads the dependencies but doesn't run the install scripts such as postinstall (which either download prebuilt binaries or run a compiler locally).

In the early days of Node (circa 2011-2013) we used to do the following:
1. Run `npm install --ignore-scripts` first.
2. Check the node_modules folder into source control.
3. Run `npm install` again, this time without the flag.
4. Put all the extra files generated by the install scripts in .gitignore.

This way the third-party code (at least, the JS-part of that code) was in the repository, and every developer / server got the version of binaries for their architecture.
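
For step 4, one way to figure out what to ignore is to scan for compiled addons. A minimal Python sketch, assuming the only generated files you care about are native addons with the `.node` extension (install scripts can produce other files too, which this won't catch); the function names `find_native_addons` and `gitignore_lines` are made up for illustration:

```python
import os


def find_native_addons(node_modules_dir):
    """Return sorted relative paths of compiled native addons (*.node files)."""
    found = []
    for root, _dirs, files in os.walk(node_modules_dir):
        for name in files:
            if name.endswith(".node"):
                rel = os.path.relpath(os.path.join(root, name), node_modules_dir)
                # Normalize to forward slashes, which .gitignore expects.
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def gitignore_lines(node_modules_dir, prefix="node_modules"):
    """Build .gitignore entries anchored at the repository root."""
    return [f"/{prefix}/{path}" for path in find_native_addons(node_modules_dir)]
```

Running `gitignore_lines("node_modules")` after the second `npm install` gives a starting list; anything else the install scripts wrote would still have to be added by hand.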

It wasn't bulletproof, though, since:
1. The scripts could do different things anyway.
2. More importantly, one could upload a new version of a library to npm under the same version number.

These days, lockfiles and stricter npm publishing rules have largely eliminated both issues, and updating dependencies doesn't produce 10k-line diffs in git history anymore.
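
To illustrate why a lockfile catches the republished-version problem: npm's package-lock.json records an `integrity` string for each package, in Subresource Integrity format (`sha512-` followed by the base64 digest). A minimal Python sketch of that kind of check, assuming a single sha512 entry; `sri_sha512` and `matches_lockfile` are hypothetical helper names, not npm APIs:

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Compute an SRI-style integrity string for the given tarball bytes."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_lockfile(tarball_bytes: bytes, integrity: str) -> bool:
    """True if the tarball hashes to the integrity value pinned in the lockfile."""
    return sri_sha512(tarball_bytes) == integrity
```

If someone republished different contents under the same version number, the downloaded bytes would no longer hash to the pinned value and the check would fail.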

And if you do have more platforms, why not just check in one node_modules-directory for each?

The idea of redownloading all packages from external sources all the time (without even having a fallback plan) seems completely brain-dead to me. Didn't people learn from leftpad-gate?

> And if you do have more platforms, why not just check in one node_modules-directory for each?

Now you have to keep them in sync or risk running into unreproducible build failures. Also, if you update the binary dependencies on, say, macOS, you still need an x86-64 Linux machine to build the dependency for that platform.

Not saying it's impossible, but without a proper process (e.g. a build server being the only place that updates dependencies) this is going to be painful.
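
If you do go the one-directory-per-platform route, a cheap sync check is to compare the versions recorded in each tree. A rough Python sketch, assuming each platform's tree is a separate checked-in directory and only looking at top-level packages (nested node_modules are ignored); the helper names are hypothetical:

```python
import json
import os


def installed_versions(node_modules_dir):
    """Map top-level package names (including @scope/name) to their versions."""
    versions = {}
    for entry in sorted(os.listdir(node_modules_dir)):
        path = os.path.join(node_modules_dir, entry)
        if entry.startswith("@") and os.path.isdir(path):
            # Scoped packages live one level deeper: node_modules/@scope/name
            candidates = [(f"{entry}/{sub}", os.path.join(path, sub))
                          for sub in sorted(os.listdir(path))]
        else:
            candidates = [(entry, path)]
        for name, pkg_dir in candidates:
            manifest = os.path.join(pkg_dir, "package.json")
            if os.path.isfile(manifest):
                with open(manifest, encoding="utf-8") as f:
                    versions[name] = json.load(f).get("version")
    return versions


def version_drift(dir_a, dir_b):
    """Return {package: (version_in_a, version_in_b)} for every mismatch."""
    a, b = installed_versions(dir_a), installed_versions(dir_b)
    return {name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)}
```

An empty result from `version_drift` only means the top-level versions agree; it says nothing about whether the platform-specific binaries were actually rebuilt.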