Hacker News
by no_wizard 1089 days ago
Does anyone have a detailed understanding of why CommonJS (and its async incarnation, AMD) were not adopted by browsers?

I personally like the `import` syntax a lot and it's a little cleaner to read, but CommonJS and AMD were the undisputed winners among module formats until ES Modules were born. Not that I have a problem with ES Modules; I don't. However, I am interested in what was so insufficient about the preceding formats that we couldn't have standardized on them.

EDIT: I know about the deal with CommonJS being synchronous. That isn't per se an issue I don't think, esp. because AMD built on top of CommonJS primitives, and with minimal refactoring CommonJS code could be used in the browser when defined this way if asynchronicity is a must. Generally, what I "imagine" browsers doing with CommonJS is making the `require` calls async in the background (i.e., not visible to developers) so they can resolve the modules then parse the code. This isn't terribly different from how import statements work today.

I'm wondering why we didn't undertake the work to just improve the existing format, more or less.

EDIT 2: I'm interested from a historical perspective. I think ESM is the right choice and 100% the future.

9 comments

I can speak to the one listed in the article: "difficult to tree-shake, which can remove unused modules and minimize bundle size."

This is because of a much deeper issue: static analysis is highly complex with the near-free-for-all that is CommonJS require & module.exports syntax. ES Modules is stricter and much easier to statically deal with.

At a high level, why? You can throw just about anything in a `module.exports` statement, and the syntax to "require" it also has a lot of leeway. You can actually see the code for this in the Node codebase: module resolution is handled in JavaScript at /lib/internal/modules/cjs/loader.js vs /lib/internal/modules/esm (heads up, both approaches are a Lot to grok)

Understand that with the CJS approach, you can dynamically export modules at runtime under whatever name you wish, with whatever value you want, which may even include dynamic require statements themselves. Nightmare for static analysis.
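To make the runtime-naming point concrete, here is a minimal sketch. The module source, the `runCommonJs` helper, and the export names are all hypothetical, and the wrapper is only a stand-in for what a CommonJS loader does, not Node's actual implementation. The exported name depends on a value computed while the code runs, so no amount of reading the source tells you which one you get.

```javascript
// Hypothetical CommonJS module source, held as a string for the demo.
const cjsSource = `
  const prefix = Date.now() % 2 === 0 ? 'even' : 'odd';
  module.exports[prefix + 'Handler'] = function () { return prefix; };
`;

// Minimal stand-in for a CommonJS wrapper (an assumption, not Node's loader):
// the source only sees `module` and `exports` as function parameters.
function runCommonJs(source) {
  const mod = { exports: {} };
  new Function('module', 'exports', source)(mod, mod.exports);
  return mod.exports;
}

// The export name is only knowable after execution.
const exportedNames = Object.keys(runCommonJs(cjsSource));
console.log(exportedNames); // one name: either evenHandler or oddHandler
```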

It makes a lot more sense if you try it for yourself. Build a module resolution algorithm including: determining all the imports, all the files those imports are from, mixing with 3rd party and local imports, and building that chain recursively.

You can do it, but the edge cases surrounding CommonJS make it super difficult. I'd go so far as to say it's basically impossible to get 100% success in all the desired scenarios without directly invoking the code.
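As a rough illustration of where a static pass over CommonJS gets stuck, here is a deliberately naive sketch. `scanRequires` and the sample source are hypothetical, and the regex ignores comments, strings, and aliasing of `require`, which a real tool would have to handle with a proper parser. It separates string-literal specifiers, which can be followed, from computed ones, which cannot be resolved without running the code.

```javascript
// Naive regex-based scanner, for illustration only (an assumption, not how
// any real bundler works): it classifies each require() argument.
function scanRequires(source) {
  const literal = [];
  const dynamic = [];
  const callPattern = /\brequire\s*\(\s*([^)]*)\)/g;
  for (const match of source.matchAll(callPattern)) {
    const arg = match[1].trim();
    // A plain quoted string with nothing else around it can be resolved statically.
    const isLiteral = /^(['"])[^'"]*\1$/.test(arg);
    if (isLiteral) {
      literal.push(arg.slice(1, -1));
    } else {
      dynamic.push(arg);
    }
  }
  return { literal, dynamic };
}

// Hypothetical module source with one static and one computed require.
const sampleSource = `
const fs = require('fs');
const plugin = require('./plugins/' + name);
module.exports[name] = plugin;
`;

const scanResult = scanRequires(sampleSource);
console.log('resolvable:', scanResult.literal);
console.log('needs execution:', scanResult.dynamic);
```

The second list is the part that forces a tool to either give up, warn, or execute the code.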

While I agree the dynamic nature of CommonJS would be problematic, there were successful projects around treeshaking commonjs[0] that worked really well.

I think dynamic imports have some of the same footguns here, to be honest. Can't deny ESM is easier to statically analyze though, that much appears to be true across the board based on available evidence.

[0]: https://github.com/indutny/webpack-common-shake

To be honest, I think the AMD incarnation is a complete non-starter. It’s just such a funky, weird little thing that only makes sense because it’s a compatibility shim. Nobody wants to directly author AMD, and someone shipping a JS implementation wants to ship features that people will use directly.

I mean, I guess people will directly write AMD modules, and make modules using some giant script that uses cat, but the future of JavaScript lies with making each source file a valid, correct piece of JavaScript. When each source file is valid and correct, and doesn’t need to be preprocessed in order to work, your tooling will work a lot better.

The browser authors know you can’t un-ship JavaScript features. ES6 import/export is damn good stuff, and people in the browser aren’t saddled with some weird compatibility shim like AMD.

The adoption of ES6 modules in the client-side landscape has far outstripped its adoption in Node.js. I honestly can’t wait for require() to die, in both its cjs and AMD variations. The tooling support for ES6 modules is miles better.

+this ... AMD was just weird and clunky to use in practice... CJS bundlers were much easier to grasp by comparison, and when browserify (and those that followed) came out, it was kind of a no-brainer at the time. If ESM were finalized maybe even 2 years earlier, we'd probably be using that already for everything. I think the Node team choosing to make esm/cjs interop more difficult than what babel had been doing slowed down the switch. I get the reasons why, I just don't agree with the approach in the end. I think if they made the interop good, and declared after Node v#, it would be esm only, that would have worked out better for everyone. The risk being a kind of Python 2->3 paralysis. If the interop was good for a few versions, I don't think the friction would have been that bad. Then after 2-3 years, the switch could have been much cleaner.
Yeah. The interop between ESM and CJS in Node is beyond horrible. My sense is that the developers are trying to iron out some differences so ESM can behave according to spec and CJS can also behave according to spec. These have to be problems with edge cases, right?

I don’t quite understand why you need the .cjs/.mjs stuff, either. You can tell the difference between an ES2015 module and CommonJS module after parsing, with the one exception of modules that have no imports and exports (which should be rare). What is the holdup, then? What’s breaking?

CommonJS requires invoking the code before the modules can be resolved, whereas ES module `import` declarations can be parsed out of the code separately (from the AST, because `import` is a keyword). No invocation required.

I don't know if that's the entire story -- probably not -- but I do know that is one major differentiator for things like generating import-graphs and performing tree shaking.

(you can still do like `import('foo' + someVar)` which will only invoke dynamically at runtime, so I'm not sure how that case is dealt with)

> (you can still do like `import('foo' + someVar)` which will only invoke dynamically at runtime, so I'm not sure how that case is dealt with)

That case is dealt with more like a `fetch('foo' + someVar).then(r => r.text()).then(eval)` or similar (but of course it is not just an eval; it instead returns the exports of the module).

Dynamic imports and static ones behave very differently and static analysis generally ignores dynamic imports IIRC.

You also need to treat dynamic imports as async including everything that comes with that (error checking, awaiting, etc.)

`import(...)` returns a `Promise`, so it can resolve in the future after the file is parsed and compiled.
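A small sketch of what that looks like in practice, assuming a Node.js runtime (the `node:` built-in prefix is used so the example does not depend on any file on disk). `loadBuiltin` and `kind` are hypothetical names. The specifier is assembled at runtime, so the caller gets a `Promise` and has to handle failure explicitly.

```javascript
// `kind` is a hypothetical runtime value; the full specifier is only known
// when this function actually runs, so nothing can prefetch it statically.
async function loadBuiltin(kind) {
  try {
    // Dynamic import: evaluates to a Promise of the module namespace object.
    return await import('node:' + kind);
  } catch (err) {
    // Resolution or evaluation errors surface here, not at parse time.
    console.error(`could not load node:${kind}:`, err.message);
    return null;
  }
}

const pendingModule = loadBuiltin('path');
console.log(pendingModule instanceof Promise); // true

pendingModule.then((ns) => {
  if (ns) console.log(typeof ns.join);
});
```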
From what I remember, reading the conversations over the years...the issue was twofold:

1. Because `require()` is "just a magic function", it can't be statically analyzed by a JS runtime prior to actually running the code. This leads to limitations with regards to tree-shaking and other optimizations.

2. The last point leads to the even bigger (and probably "deal-breaker") reason for the change, the desire to fetch packages from URL sources. Since the syntax cannot be parsed efficiently, runtimes like Deno and Bun would have a much harder time fetching resources from URLs prior to running the code. The idea here, IIRC, was to eliminate the install step, the need for centralization on a single package manager and registry, and a general "non-Web" approach to the idea of packages and modules in JS.

I believe the `import` syntax was chosen to allow transitions away from `require()`, so that your programs wouldn't just stop working if ESM was enabled.

Your first point is absolutely spot-on but I am curious as to how much treeshaking was on the minds of masses at the time. The tooling of that era didn't really have any good support for tree shaking even for non-AMD includes and it was quite experimental tech (as in, I don't think it was a decision making factor for the majority of the tools on the scene).

The second point actually isn't strictly valid. I've written my own "all-in-one" async custom loader [0] that can require() CommonJS/AMD includes, regular "add a script tag" includes w/out any exports, or even css stylesheets all asynchronously, with asynchronous dependency trees for each async dependency in turn. You can define in the HTML source code a "source map" that maps each dependency name to a specific URL, so that you don't need knowledge of the filesystem tree to load dependencies.

Ideally, this source map can be generated via the tooling you use to compile the code (e.g. `tsc` is aware of the path to each dependency) but I haven't written my own tool to generate the require path to url map.

[0]: https://github.com/mqudsi/loader

Why can't implementations tree-shake through require(x) where x can be determined statically and warn where x cannot?
The `import` syntax makes it possible for the browser to start loading dependencies as soon as the module has been parsed, before it finishes being compiled. The `require` function would force the browser to download the script being depended on then and there, and since it's not async, the browser would need to pause script execution while the module is being loaded.
I don’t have an answer, and this is kind of superficial, but one thing I felt about the two was that import statements feel like compilation instructions. “Statically link this.” While CommonJS was a runtime function “synchronously acquire and parse this.”

I’m going to guess the good faith answer really involves some version of “CommonJS has some shortcomings and we didn’t want to confusingly write mostly-same syntax so we designed something new based on ideas from numerous languages.”

> and its async incarnation, AMD

A bare-bones implementation of AMD could be put together with less than a kilobyte of JavaScript (this is what we used at Mozilla for a minute circa 2012). Meanwhile, the ECMAScript folks were working on ES6, which was going to have a module system. Why would the browser build in support for a highly-opinionated system that you could implement yourself so trivially, all while a TC39-blessed standard was in the works?
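To give a sense of how little code that is, here is a minimal sketch of an AMD-style registry. This is not the loader referred to above; `miniDefine`, `miniRequire`, and `resolveModule` are hypothetical names, there is no script fetching, and circular dependencies are not handled. It only shows the core idea: a named factory plus a dependency list whose order must match the factory's parameters.

```javascript
const definitions = new Map(); // name -> { deps, factory }
const instances = new Map();   // name -> already-built module value

// Register a module; nothing runs until something asks for it.
function miniDefine(name, deps, factory) {
  definitions.set(name, { deps, factory });
}

// Build a module once, resolving its dependencies depth-first.
function resolveModule(name) {
  if (instances.has(name)) return instances.get(name);
  const def = definitions.get(name);
  if (!def) throw new Error(`module not defined: ${name}`);
  // The order of `deps` decides the order of the factory's arguments.
  const value = def.factory(...def.deps.map(resolveModule));
  instances.set(name, value);
  return value;
}

// Callback style, as in AMD. A real loader would fetch scripts here;
// this sketch only defers the callback to a microtask.
function miniRequire(deps, callback) {
  Promise.resolve().then(() => callback(...deps.map(resolveModule)));
}

miniDefine('math', [], () => ({ double: (n) => n * 2 }));
miniDefine('greeter', ['math'], (math) => ({
  greet: (n) => `value: ${math.double(n)}`,
}));

miniRequire(['greeter'], (greeter) => console.log(greeter.greet(21))); // value: 42
```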

> what I "imagine" browsers doing with CommonJS is making the `require` calls async in the background (i.e., not visible to developers) so they can resolve the modules then parse the code

That's not possible. You need to run the code to know what's being required: if I call `require('./' + getModuleName())`, you don't know what's being required until `getModuleName()` is evaluated. So you actually need to start running the JS. You need to pause execution of the code calling `require()` (a la `alert()`), and then you can download and parse the required module. When the file is downloaded, you can parse and execute the imported module. Each file would need to be downloaded/parsed/executed _synchronously_ in the order that each `require()` happens in: it's only async in so far as the JS pauses execution and picks up later.

> This isn't terribly different from how import statements work today.

Not so. You can find and resolve `import` statements (note: not `import()` calls, though these return Promises) without executing a JS file. You can parse the imports out of a file in one pass and fetch/parse/repeat for each import in the dependency tree before anything starts executing. Since "native" imports are static and declarative, you can resolve all of them without ever executing any code. And any dynamic imports return promises that the programmer needs to explicitly handle the behavior of at runtime.
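Here is a rough sketch of that "parse, fetch, repeat" loop. Everything in it is an assumption made for illustration: the `files` map stands in for the network, and the regex is a crude substitute for a real parser. The point is that the whole graph is discovered without executing a single module.

```javascript
// Hypothetical in-memory "server": specifier -> module source text.
const files = {
  './app.js': `
import { a } from './a.js';
import './b.js';
console.log(a);`,
  './a.js': `
import { c } from './c.js';
export const a = c + 1;`,
  './b.js': `export const b = 2;`,
  './c.js': `export const c = 1;`,
};

// Crude stand-in for a parser: finds static `import ... from '...'` and
// bare `import '...'` declarations. `import(...)` calls are not matched.
function staticImports(source) {
  const pattern = /^\s*import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/gm;
  return [...source.matchAll(pattern)].map((m) => m[1]);
}

// Breadth-first walk: read a file, collect its imports, repeat.
function resolveGraph(entry) {
  const graph = new Map();
  const queue = [entry];
  while (queue.length > 0) {
    const specifier = queue.shift();
    if (graph.has(specifier)) continue;
    const deps = staticImports(files[specifier]);
    graph.set(specifier, deps);
    queue.push(...deps);
  }
  return graph;
}

const moduleGraph = resolveGraph('./app.js');
console.log([...moduleGraph.keys()]);
```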

> just improve the existing format

1. You'd have to kill dynamic imports (passing anything other than a string literal to `require()`), which would be impossible to do without breaking compatibility and couldn't be polyfilled.

2. AMD allowed a callback syntax for `require()` (it came out years before promises), which is cumbersome. Adding promises later would be challenging and leave technical debt.

> Adding promises later would be challenging and leave technical debt.

I wrote an "aio loader" many years ago that can load (in the browser) AMD/CommonJS/node or just "include this script in your html" dependencies. It loads dependencies (and their own dependencies) asynchronously, with support for use via plain `require()` without callbacks, `require(foo, foo => {})` callback support, and even dynamic async loading (`var App = await requireAsync("foo")`).

I never published it publicly (it's just ticking away on our production sites) but I was motivated to push it to GitHub just now [0].

[0]: https://github.com/mqudsi/loader

FWIW they did it[0]. Alameda is a promise-based AMD loader from the same folks who were key to AMD's success when it was at its most popular.

[0]: https://github.com/requirejs/alameda

Alameda does a great job of explaining why you can't just slap promises on something: it breaks the semantics of what `require()` returns. `require('foo')` returns foo, maybe. `require(['foo'])` returns `Promise<[foo]>`.
That's correct. For my own library (see sibling comment above) I started off with a single `require()` entry point that can be used to load a dependency, load a dependency and invoke a callback, or asynchronously load a dependency (e.g. return a promise) but then changed it to two separate functions (everyone's favorite `require()` plus an async version very cleverly named `requireAsync()`).
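A minimal sketch of the difference, using a hypothetical in-memory registry in place of real loading. None of these function names come from Alameda or from the loader linked above; they only illustrate why one overloaded entry point is awkward and two separate ones are easier to reason about.

```javascript
// Hypothetical registry standing in for modules that are already loaded.
const registry = new Map([['foo', { name: 'foo' }]]);

// One overloaded entry point: the return type depends on the argument shape,
// so callers cannot know whether to `await` without inspecting the call.
function overloadedRequire(idOrIds) {
  if (Array.isArray(idOrIds)) {
    return Promise.all(idOrIds.map((id) => Promise.resolve(registry.get(id))));
  }
  return registry.get(idOrIds);
}

// Two entry points, each with exactly one return type.
function requireSync(id) {
  return registry.get(id);
}

function requireAsync(id) {
  return Promise.resolve(registry.get(id));
}

console.log(overloadedRequire('foo').name);              // foo
console.log(overloadedRequire(['foo']) instanceof Promise); // true
requireAsync('foo').then((foo) => console.log(foo.name));
```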
> Does anyone have a detailed understanding of why CommonJS (and its async incarnation, AMD) were not adopted by browsers?

I only started my career in earnest in 2012, but even then compatibility with old versions of IE was a major point, due to their high market share.

IE6 was officially retired in 2014, but even then it still accounted for 4.2% of the traffic:

https://www.computerworld.com/article/2488448/ie6--retired-b...

Then there were IE8-11, but it was IE6 which lingered way past its welcome, considering it was originally released in 2001.

> That isn't per se an issue I don't think, esp. because AMD built on top of CommonJS primitives, and with minimal refactoring CommonJS code could be used in the browser when defined this way if asynchronicity is a must.

This existed: the UMD module format was the turducken you got if you built modules to work both as AMD and CommonJS at the same time. AMD wrappers, async require, and a bunch of boilerplate to determine if the module was being loaded by an AMD loader or in a CommonJS environment (or worst of all, a CommonJS environment with AMD loader primitives).
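For anyone who never saw it, the detection logic looked roughly like the sketch below. A real UMD wrapper probes the globals `define`, `module`, and the global object directly inside an immediately-invoked function; here they are passed in through a hypothetical `env` object (and a hypothetical `installUmd` helper) purely so the branches can be exercised in one file.

```javascript
// Hypothetical helper: `env` stands in for the globals a real UMD wrapper
// would check directly. Returns which branch was taken.
function installUmd(env, name, factory) {
  if (typeof env.define === 'function' && env.define.amd) {
    env.define([], factory); // an AMD loader is present
    return 'amd';
  }
  if (typeof env.module === 'object' && env.module !== null && env.module.exports) {
    env.module.exports = factory(); // CommonJS-style environment
    return 'commonjs';
  }
  env.root[name] = factory(); // plain script tag: attach to a global object
  return 'global';
}

const demoFactory = () => ({ greet: () => 'hello' });

const cjsEnv = { module: { exports: {} }, root: {} };
console.log(installUmd(cjsEnv, 'demoLib', demoFactory)); // commonjs

const scriptEnv = { root: {} };
console.log(installUmd(scriptEnv, 'demoLib', demoFactory)); // global
```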

It was a lot of ugly boilerplate. I don't think I ever saw a project intentionally write UMD modules by hand. I do recall some TypeScript projects that distributed as UMD modules for a while, because that was boilerplate TypeScript was always good at streamlining.

> I personally like the `import` syntax a lot and it's a little cleaner to read, but CommonJS and AMD were the undisputed winners among module formats until ES Modules were born. Not that I have a problem with ES Modules; I don't. However, I am interested in what was so insufficient about the preceding formats that we couldn't have standardized on them.

I think it is absolutely the syntax that needed standardizing. AMD was always a hack for module loading, using available browser tech as best it could and screaming for better syntax. There was so much pain every time working with AMD in making sure that define() wrappers were correct and the list of dependencies correctly matched the names and order of those as parameters of the module's function wrapper. AMD was always in desperate need of an import syntax. (One of the reasons TypeScript was built was to provide such an import syntax ahead of ESM standardization. It's why I started using TypeScript in the 0.x wilds.)

In many ways ESM was always the natural improvement of the AMD format. One of the things that hung browser standardization in various stages was debates about how compatible to be with AMD. There were multiple attempts and a lot of debate at "Loader APIs" that could be extension points to directly interface classic AMD loaders such as RequireJS with the browser's own. Had one of those Loader APIs made the final cut it likely would have been possible to "natively" import legacy AMD directly from ESM.

Loader APIs lost to a number of factors including complexity and I think also the irony that CommonJS won the "bundler war" while those debates were going on. I think it must have seemed that the writing was on the wall that AMD compatibility was no longer that useful and Loader APIs were never going to be great for CommonJS compatibility (again, because of those synchronous assumptions that doomed CommonJS to always be the nemesis of browser modules).

(The dying compromises of the "Loader APIs" tangents are what eventually delivered import maps.)

AMD compatibility without "Loader APIs" is basically impossible. Even though RequireJS was quite dominant, it was never the only loader, and part of its dominance was that it was an extremely configurable loader with tons of plugins. There wasn't an "AMD loader standard" that browsers could emulate.

I generally do think that ESM is what we got trying to fix the syntax needs of AMD and clean up and actually standardize the AMD loader situation. In the end it didn't end up backwards compatible with AMD like it tried to be, but from my impression it certainly tried, and that was unfortunately part of why ESM standardization was so slow and what led to such a large mess of CommonJS modules in the wild in the time that it took.