by abathur 805 days ago
I guess there's a hybrid model where you're able to select exactly what you're depending on and pull it in dynamically at build/package time.

I've thought a little about, for example, building something that could slice just the needed utility functions out of a Shell utility library. (Not really for minimizing the dependency graph--just for reducing the source-time overhead of parsing a large utility library that you only want a few functions from.)

Would obviously need a lot of toolchain work to really operationalize broadly.

I can at least imagine the first few steps of how I might be able to build a Nix expression that, say, depends on the source of some other library and runs a few tools to find and extract a specific function and the other (manually-identified) bits of source necessary to build a ~library with just the one function, and then let the primary project depend on that. It smells like a fair bit of work, but not so much that I wouldn't try doing it if the complexity/stability of the dependency graph was causing me trouble?
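As a rough illustration of the "find and extract a specific function" step, here is a minimal Python sketch. It is not a real tool: `extract_function` and the `LIBRARY` text are hypothetical, and it assumes each function is written as `name() {` at the start of a line and closed by a line that is exactly `}`. Anything fancier would need a real shell parser.

```python
import re


def extract_function(source: str, name: str) -> str:
    """Return the text of shell function `name` found in `source`.

    Assumption: the definition opens with `name() {` (optionally preceded
    by `function`) on its own line and ends at the first line that is
    exactly `}`. This is a naive line scan, not a shell parser.
    """
    start = re.compile(
        r"^(?:function\s+)?" + re.escape(name) + r"\s*\(\)\s*\{\s*$"
    )
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if start.match(line):
            for j in range(i, len(lines)):
                if lines[j] == "}":
                    return "\n".join(lines[i : j + 1]) + "\n"
            raise ValueError(f"no closing brace found for {name!r}")
    raise KeyError(f"function {name!r} not found")


# Hypothetical two-function utility library used only for this example.
LIBRARY = """\
log() {
  printf '%s\\n' "$*" >&2
}

die() {
  log "$@"
  exit 1
}
"""

print(extract_function(LIBRARY, "die"))
```

Note that the extracted `die` still calls `log`, which is exactly the "other (manually-identified) bits of source" problem: the slice is only usable once its internal dependencies come along with it.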


Isn't that already just the role of tree-shaking optimizers? At that point the problem seems to be languages that don't have good tree-shakers, don't/can't tree-shake library dependencies, or maybe that tree-shaking should happen earlier and more often than it usually does?
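For anyone unfamiliar, the core of tree-shaking is a reachability pass over a dependency graph: start from the entry points, keep whatever they transitively reference, and drop the rest. A minimal sketch in Python, assuming the call graph has already been computed somehow (the `CALL_GRAPH` contents and function names are made up for illustration):

```python
def reachable(call_graph: dict[str, set[str]], entry_points: set[str]) -> set[str]:
    """Return every node reachable from `entry_points` in `call_graph`."""
    seen: set[str] = set()
    stack = list(entry_points)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Nodes missing from the graph are treated as having no dependencies.
        stack.extend(call_graph.get(node, ()))
    return seen


# Hypothetical call graph: each function maps to the functions it calls.
CALL_GRAPH = {
    "main": {"die"},
    "die": {"log"},
    "log": set(),
    "retry": {"log", "sleep_backoff"},
    "sleep_backoff": set(),
}

kept = reachable(CALL_GRAPH, {"main"})
dropped = set(CALL_GRAPH) - kept
print(sorted(kept), sorted(dropped))
```

The hard part in practice is building an accurate graph, not walking it, which is roughly why module systems that allow dynamic lookups are harder to shake than ones with static imports.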

From what I've observed, the "granularity pendulum" in the JS ecosystem seems very directly related to the module system. CommonJS was tough to tree-shake, so you sometimes had wild levels of granularity where even individual functions might be their own package in the dependency graph. ESM is a lot easier to tree-shake, and alongside ESM adoption you see more of the libraries that once published dozens or hundreds of sub-packages repackaging back into just one top-level package.

Perhaps by coincidence, there's a good post on tree-shaking+wasm on the front page this morning: https://news.ycombinator.com/item?id=40023319

I imagine the answer's ~yes from the perspective of something you build and deploy (and I agree it's relevant to the article--but I'll caveat that I read xamuel to be asking the question very broadly).

Relying on a post-build process to avoid deploying unused code and dependencies still exposes you to a subset of problems with most if not all of the dependency graph.

Sufficiently-rich correct-by-definition metadata on the internal and external dependencies of each package might let you prune some branches without requiring the dependency to be present, but broadly speaking there are a lot of cases where that can't really help?

Some package managers have "features" (e.g. Rust's cargo) or "extras" (e.g. Python) which might be what you are talking about.

Of course another solution is to make smaller libraries in the first place, so your users don't feel like they need to break them up.

> Some package managers have "features" (e.g. Rust's cargo) or "extras" (e.g. Python) which might be what you are talking about.

I don't think so (though I agree that mechanisms like this are one way to approach the problem).

AFAIK both of these examples are mostly used to provide ~optional behavior (usually to exclude dependencies if you don't need the behavior). This can minimize the set of dependencies, but it's resting on the maintainers' sense of what the core of their library is, and what's ancillary. Said the other way around, both require the software's maintainers to anticipate your use case and feel like it was a good use of their time to split things up very granularly.
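To make the "optional behavior" shape concrete, here's a sketch of the pattern a Python library typically uses behind an extra: import the optional dependency only when the feature is used, and point at the extra if it's missing. The helper `require_extra` and the package name `examplepkg` are hypothetical; only the `pip install 'package[extra]'` syntax is real.

```python
import importlib


def require_extra(module_name: str, extra: str, package: str = "examplepkg"):
    """Import an optional dependency, or explain which extra provides it.

    `package` and `extra` are placeholders for illustration; a real library
    would hard-code its own names here.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is needed for this feature; "
            f"install it with: pip install '{package}[{extra}]'"
        ) from exc


# The standard-library module is always present, so this succeeds.
json_module = require_extra("json", "json-support")
print(json_module.dumps({"ok": True}))
```

The split only exists where the maintainers chose to put one of these guards, which is the limitation: you get the seams they anticipated, not the ones your use case needs.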

In xamuel's hypothetical of library A using one minor thing from library B, this almost certainly means reusing less of library B than its maintainers anticipate.

I can imagine this working in cases where the package is a true bundle of discrete utilities that almost no one will need all of (the package itself is an incredibly small core/stub and each utility is a feature/extra), and the maintainers want to intentionally design it for modular consumption.

But it's hard to imagine many maintainers going through the work of dicing a cohesive library up into granular units when they think most users will be consuming it whole?