by arcatek 2840 days ago
I'm the author behind the proposal, feel free to ask me any question you might have!

As a personal note, I'm super excited and so grateful to have had the chance to work on this project - node_modules has been a longstanding thorn in the side of the JavaScript ecosystem, and to finally have a chance to try something completely new is amazing.

9 comments

This is awesome. I didn't read the full proposal yet, but at first glance it's very close to how we do things where I work (because we needed something long before Node even existed, and never replaced it).

The biggest hurdle we constantly face is tools that make assumptions about the environment (e.g. Flow, TypeScript, WebStorm), so changing anything about module resolution breaks everything. We have to spend a lot of time hacking it in every single time. Sometimes the hooks just aren't there (TypeScript has a hook for overriding module resolution, but editors often don't give you an easy way to hook into TypeScript's own hooks).

Any thoughts on how things would work for tools if this was to go forward? Would they all need to be updated to honor this new way to resolve modules?

Flow is already 100% compatible with PnP, through the `module.resolver` configuration setting. We even saw some slight perf increases after switching.

I guess the same could be done for other tools: the .pnp.js file can be easily interfaced with anything you throw at it, without the tools having to care about the way the dependencies are resolved under the hood. Even non-JS tools can simply communicate with it through the CLI interface, which returns JSON data ready to use.
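To make the idea of interfacing a tool with the resolver a bit more concrete, here is a minimal sketch of a tool-side adapter. It assumes the resolver exposes a `resolveRequest(request, issuer)` function; `createToolResolver`, `stubApi`, and every path below are hypothetical names made up for illustration, and the stub stands in for a real resolver so the snippet runs on its own.

```javascript
// Hypothetical adapter: wraps any object exposing resolveRequest(request, issuer)
// so that a tool can delegate bare package requests to it.
function createToolResolver(pnpApi) {
  return function resolve(request, issuerFile) {
    // Design choice of this sketch: relative and absolute requests are
    // left to the tool's own logic, signalled by returning null.
    if (request.startsWith(".") || request.startsWith("/")) {
      return null;
    }
    return pnpApi.resolveRequest(request, issuerFile);
  };
}

// Stand-in resolver with a made-up table (assumption: illustrative data only).
const stubTable = new Map([["react", "/cache/react-16.5.0/index.js"]]);
const stubApi = {
  resolveRequest(request, issuer) {
    if (!stubTable.has(request)) {
      throw new Error(`Cannot resolve ${request} from ${issuer}`);
    }
    return stubTable.get(request);
  },
};

const resolveForTool = createToolResolver(stubApi);
const reactPath = resolveForTool("react", "/repo/src/app.js");
const relativeResult = resolveForTool("./util", "/repo/src/app.js");
console.log(reactPath, relativeResult);
```

The point of the indirection is that the tool only depends on the function contract, not on how the lookup is implemented behind it.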

This is a bit of a tangent, but do you know of good tutorials, blog posts, videos, or books which help build a mental framework of how JavaScript packaging works internally?

My goal is to be better at debugging development environment issues.

To be more specific, I am asking this from the perspective of a user of the node ecosystem who has repeatedly had the experience of running into problems with my development environment or packages and not really knowing what to do to get myself un-stuck. I can totally search google/stack overflow/github for an answer and sometimes there is a clear one. Sometimes I find a command to run, sometimes I don't. In either case, I come away with a strong feeling that I've not actually learned anything. If the same thing happened again, I might remember what to google for next time, but I wouldn't be able to reason clearly about what I'm working with. In the past, when I've run into these issues, I've just thrown time at them until something sticks. That kinda sucks.

I'd like to find a way to put in some work and come away with a solid understanding. When you were starting this project, how did you go about building up your knowledge of the underlying abstractions that npm/yarn/webpack/etc. use?

When you see a misconfiguration or get a bug report with this project, how do you go about investigating it?

Have you ever seen a really good explanation of a bug/misconfiguration which helps the reader build a solid mental model?

Why are the .pnp.js file and the .pnp directory hidden? Most projects are moving away from hidden magic files/dirs, I think.
It seems weird that yarn.lock is not hidden but .pnp.js is.
I quickly read the paper and I wonder if there's anything in place to deal with packages that generate side effects via postinstall. Can we at least manually create a list of them in package.json or something so that Yarn copies them into node_modules like before?
Postinstall scripts are the main issue, yep. Right now the current implementation doesn't do anything special with them, meaning that the packages that use them are installed inside the cache (except when the scripts are disabled altogether, which often works well enough since there aren't that many packages that require postinstall scripts).

This is obviously wrong, so we'll soon go to a model where we "unplug" the packages and put them into a specific directory (a bit like the node_modules folder, but entirely flat). The feature itself is already there (`yarn unplug` in the rfc), but we need to wire it to the install process.

Ideally, I think it would be beneficial for the ecosystem to move away as much as possible from postinstall scripts - WebAssembly has become a very good solution for native libraries, and has the added benefit that it works everywhere, including the web.

How would you implement https://github.com/ronomon/direct-io with WebAssembly?

Native modules exist because people sometimes need to drop down to work with the platform directly.

Someone working on wasm will likely be able to tell you more, but from what I understand dynamic linking is on the table[1]. I didn't quite say that it was ready yet, but rather that it was on the right path.

[1] https://webassembly.org/docs/dynamic-linking/

This seems to be solving a problem quite similar to the one of getting browsers to resolve non-relative imports (without file system access, they also need a precise map from module names to paths). Unfortunately, I can't remember who or what was working on that and I'm not sure how much progress it is making, but it'd be wonderful if this could somehow be framed in a way where it also helps that issue along. Are you taking this into consideration in the design?
I think you're referring to the package-name-maps[1] proposal. We were made aware of it midway through development, and while we chose to continue using the static data tables approach, the two aren't incompatible.

The reason we abstracted the static tables behind a JS api is precisely to make it easier for everyone to experiment with various implementations - some might use the internally embedded data as we do, some could consume a package-name-maps file, and some could do something entirely different. As long as the contract described in the document is fulfilled, everything is possible.

[1] https://github.com/domenic/package-name-maps

Hey, I just saw a tweet about this, it looks super exciting. A question I have is how does this work with yarn workspaces? I have a monorepo setup and would love to try this, but I use both yarn workspaces and lerna to run commands across packages. Will nohoist be respected?
Workspaces are working just fine: instead of creating the symlinks we simply register them into the static tables, and resolve them from their actual location on the disk.
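As a rough illustration of what registering workspaces into the tables could look like, here is a toy table builder. The table layout, the `buildLocationTable` name, and all package names and paths are assumptions made for this sketch, not the actual implementation.

```javascript
// Hypothetical table builder: maps each package name to the directory it
// should be loaded from. Layout and names are illustrative only.
function buildLocationTable(cachePackages, workspaces) {
  const table = new Map();
  // Regular dependencies point into the (made-up) cache directory.
  for (const { name, version } of cachePackages) {
    table.set(name, `/yarn-cache/${name}-${version}`);
  }
  // Workspaces are neither copied nor symlinked: they are registered
  // with their actual location on disk.
  for (const { name, location } of workspaces) {
    table.set(name, location);
  }
  return table;
}

const locationTable = buildLocationTable(
  [{ name: "left-pad", version: "1.3.0" }],
  [{ name: "@acme/ui", location: "/repo/packages/ui" }]
);
console.log(locationTable.get("@acme/ui"));
```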

Nohoist was a bit of a hack from the beginning (precisely because the hoisting was anything but guaranteed), so some incompatibility might happen for packages that cannot work without it.

That said, I'm not too worried: we've tried a bunch of the most popular open-source packages, and they all seemed to work fine - the one issue was with an interaction between create-react-app and eslint, but there are discussions going on to fix that in a clean way.

As a data point, my company is using yarn workspaces where one of the projects is an Electron app, and another has a Node server with native dependencies.

Our nohoist section currently looks like this in order to get things to behave:

  "nohoist": [
    "**/electron/**",
    "**/electron",
    "**/electron*/**",
    "**/electron*",
    "**/canvas-prebuilt/**",
    "**/canvas-prebuilt"
  ]
These are likely going to require auto-ejection due to being native and having postinstall scripts.

Would it make sense to not bother with hoisting of ejected modules when using YPnP, given that YPnP solves the same problems as hoisting in a more global way? We'd be able to get rid of the nohoist section in that case.

The reason I use nohoist is that my monorepo has a react native package and a web package inside it. I found that if I hoist react native stuff it doesn't resolve properly with the metro bundler, so I have to nohoist every react native package.
Another question—does this also resolve from directory paths to index.js, add missing extensions, and so on, or does it only map package names to directories? As in, does this get rid of all 'superfluous' file system access during resolution?
It works in two steps. The first step resolves requests to "unqualified paths" (which are just the paths within the cache, without the index.js/extension resolution). This step is entirely static; no filesystem access is involved.

The second step converts the "unqualified paths" into "qualified paths", and is basically the index.js/extensions resolution. We currently access the filesystem in order to resolve them (just like Node), because we didn't want to store the whole list of files within our static tables (fearing that it would become too large and would slow down the startup time because of the parsing cost).

So to answer your question: we get rid of the node_modules folder-by-folder traversal, but decided that the extension check was an acceptable tradeoff. We might improve that a bit by storing the "main" entries within the tables, though, which would be a nice fast path for most package requires.
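The two steps described above can be sketched roughly as follows. This is a toy model under stated assumptions, not Yarn's code: the table layout, the function names, and all paths are hypothetical, the file check is injected as `isFile` so the example runs without touching the disk, and the package.json "main" lookup is left out for brevity.

```javascript
// POSIX path helpers keep the example deterministic on every platform.
const path = require("path").posix;

// Hypothetical static table: issuer package -> dependency name -> location.
const dependencyTable = new Map([
  ["my-app", new Map([
    ["lodash", "/cache/lodash-4.17.10/node_modules/lodash"],
  ])],
]);

// Step 1: purely static lookup, no filesystem access.
function resolveToUnqualified(request, issuer) {
  const parts = request.split("/");
  // Scoped packages (@scope/name) use two segments for the package name.
  const nameLength = request.startsWith("@") ? 2 : 1;
  const packageName = parts.slice(0, nameLength).join("/");
  const subPath = parts.slice(nameLength).join("/");

  const dependencies = dependencyTable.get(issuer);
  const location = dependencies && dependencies.get(packageName);
  if (!location) {
    throw new Error(`${issuer} does not list ${packageName} as a dependency`);
  }
  return subPath ? path.join(location, subPath) : location;
}

// Step 2: extension and index resolution; this is where the file checks happen.
function resolveUnqualified(unqualified, isFile, extensions = [".js", ".json", ".node"]) {
  const candidates = [
    unqualified, // an exact path is tried first
    ...extensions.map((ext) => unqualified + ext),
    ...extensions.map((ext) => path.join(unqualified, "index" + ext)),
  ];
  const match = candidates.find((candidate) => isFile(candidate));
  if (match === undefined) {
    throw new Error(`Cannot qualify ${unqualified}`);
  }
  return match;
}

// Made-up file list standing in for the real filesystem.
const fakeFiles = new Set([
  "/cache/lodash-4.17.10/node_modules/lodash/index.js",
  "/cache/lodash-4.17.10/node_modules/lodash/fp.js",
]);
const isFile = (candidate) => fakeFiles.has(candidate);

const qualifiedMain = resolveUnqualified(resolveToUnqualified("lodash", "my-app"), isFile);
const qualifiedFp = resolveUnqualified(resolveToUnqualified("lodash/fp", "my-app"), isFile);
console.log(qualifiedMain, qualifiedFp);
```

Only the second function ever consults `isFile`, which mirrors the tradeoff described: the package lookup is a table read, and the remaining filesystem cost is limited to the extension and index checks.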

Makes sense. But I guess precise paths (ending with .js) are recognized and cause no actual file system searching?
Yes, we only do a single stat in those cases (to check whether it's a directory or not).
Looks great! Can we start using it now? Do you have a realistic timeframe for an official integration in Yarn?
We'll see what the community response is, and once we're confident this is what everyone wants we'll merge it into master (PR is already up[1]).

While there isn't an official build at the moment, the PR includes a prebuilt version of the current branch, and we have a playground repository[2] to experiment with it.

[1] https://github.com/yarnpkg/yarn/pull/6382

[2] https://github.com/yarnpkg/pnp-sample-app

Is this transparent for bundlers?
Since the resolution is new, most of them require plugins. I've already written those for the most common projects:

https://github.com/yarnpkg/pnp-sample-app/tree/master/script...

They'll be published as separate packages once we merge the PR.