by tmoertel 4749 days ago
I'm not sure I was able to get my point across to you. Let me try another approach.

The improvement I'm talking about occurs upstream of the distributions, even though it is caused by the distributions' packaging policies.

Libraries are upstream from projects, and projects are upstream from distributions. If the distributions discourage projects from bundling libraries, this policy will encourage project developers to talk to the upstream library developers to get desired changes into the libraries, rather than go the customize-and-bundle route. This improved coordination and patch-flow benefits the users of the libraries and the users of the projects, regardless of whether those users rely on any particular distribution to get the software. Users are, as always, still free to pick whatever distribution best suits their preferences, or no distribution at all. Still, they benefit from the distributions' debundling policy.


I might be able to comment better if the term "projects" were better defined. It just depends on, well, the nature of the dependencies. Most libraries depend on other libraries. Often at least three layers deep.

If you could explain how your philosophy would deal with, for example, nginx and Apache both depending on libssl, which itself depends on libcrypto, which depends on libz and libc (both of which are also separate independent dependencies of Apache and nginx) then maybe we could discuss it better.

Oh, and in theory I should be able to swap libssl for libgnutls arbitrarily. How do we handle that?

You're conflating the dependency graph with the change-flow network. The first represents how projects rely on other projects; the second represents how changes must propagate to reach all users. Once you understand the difference, you'll understand why debundling is the sensible response to the sea of large-scale interdependent software-development projects that characterizes most FOSS ecosystems.

If I understand you correctly, you're saying the dependency graph and the "change-flow network" are completely orthogonal.

Separation of concerns is a value of good software projects. But there are practical realities that the author of the article enumerates specifically.

If there is a tight coupling between his application and a handful of upstream libraries, packagers are far more likely to break his application by distributing the latest version of that shared library. Other applications that aren't as tightly coupled can handle that upgrade. Since it is tightly coupled, he's going to be highly attuned to the upgrade needs for his specific statically compiled version.

They're not orthogonal; they're two directed graphs with the same vertices and different edges.

If the dependency graph G = (V, E) has a vertex for every software project and an edge x -> y iff downstream project y depends on upstream project x, then the change-flow network is the graph C = (V, F), where there is an edge x -> y in F iff there is a downstream path between x and y in G and also y requires an update and re-release when x changes (e.g., because it bundles a copy of x in its releases).

So if there is a change to project x, for it to flow to all affected dependents, you must update all downstream neighbors of x in the change-flow network C.

For example, consider the following dependency graph, in which library L is used by downstream library L2, and L2 by project P:

    L -> L2 -> P

If none of the projects bundle their upstream dependencies in their own releases, then the corresponding change-flow network has no edges, and updating any project requires only re-releasing its own package to satisfy all dependencies:

    L
    L2
    P

But if L2 bundles a copy of L, and P bundles a copy of L2, then the corresponding network looks like this:

    L -> L2
    L -> P

    L2 -> P

    P

A change to L requires re-releasing not only L but also L2 and P. A change to L2 requires re-releasing L2 and also P.
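
If it helps, here is a minimal sketch of that construction in Python. It assumes that "y requires a re-release when x changes" means there is a chain of bundled copies leading from x down to y; the function name `change_flow` and the `(dependent, dependency)` encoding of bundling are made up for this illustration.

```python
from collections import defaultdict


def change_flow(deps, bundles):
    """Build the edge set F of the change-flow network C = (V, F).

    deps:    set of (x, y) edges of G, meaning downstream y depends on upstream x.
    bundles: set of (y, x) pairs, meaning y ships its own copy of x
             (hypothetical encoding, chosen for this sketch).
    Returns the set of (x, y) pairs where a change to x forces a re-release of y.
    """
    downstream = defaultdict(set)
    for x, y in deps:
        downstream[x].add(y)

    vertices = {v for edge in deps for v in edge}
    flow = set()
    for start in vertices:
        # Walk downstream from `start`, but only across edges where the
        # dependent bundles its dependency; an unbundled edge stops the flow.
        stack, seen = [start], set()
        while stack:
            x = stack.pop()
            for y in downstream[x]:
                if (y, x) in bundles and y not in seen:
                    seen.add(y)
                    flow.add((start, y))
                    stack.append(y)
    return flow


deps = {("L", "L2"), ("L2", "P")}

# Nobody bundles: the change-flow network has no edges.
print(sorted(change_flow(deps, set())))
# → []

# L2 bundles L, and P bundles L2.
print(sorted(change_flow(deps, {("L2", "L"), ("P", "L2")})))
# → [('L', 'L2'), ('L', 'P'), ('L2', 'P')]
```

The second result is the same three-edge network as in the example: a change to L reaches both L2 and P, and a change to L2 reaches P.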

Does that make more sense now?

You seem to misunderstand how static linking works.

If P statically links to its own version of L2, then L2 is just a part of P. The fact that there may be a dynamically linked version of L2 elsewhere on the system is irrelevant.

Consider:

  L -> L2 -> P

  L -> L2 -> Q

  L -> L2 -> R

If the authors of L2 release a new version that P and Q are happy with, but that creates an extremely subtle segfault condition in R, then what?

The packager could just wait to release the upgrade to L2 until all downstream packages have compatible releases.

The packager could backport a subset of the L2 patches that is still compatible with R (Red Hat does this a lot).

The packager could silently curse the author of R for not statically linking the necessary frozen-in-time version of L2 and thus bypassing this problem entirely.

> If P statically links to its own version of L2, then L2 is just a part of P. The fact that there may be a dynamically linked version of L2 elsewhere on the system is irrelevant.

No, it's highly relevant because when a security fix lands for L2, it takes longer to propagate to users if projects like P bundle their own versions of L2 as part of their releases. In that case, users must wait for the project developers to work the already-released L2 fixes into their own bundled versions of L2 and then release new versions of the projects before any downstream users get the fix. But if P and other projects use the same version of L2 that everybody else does, everybody gets the fix right away.
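
To put a rough number on "takes longer", here is a small Python sketch that counts how many releases have to happen one after another before a fix reaches every user. It is a toy model, not a measurement: it assumes each bundler can only start its own release after the thing it bundles has shipped, and that bundling has no cycles. The name `release_rounds` and the dict encoding are invented for the example.

```python
def release_rounds(changed, bundles):
    """Count sequential releases needed before a fix to `changed` reaches everyone.

    bundles: dict mapping a project to the set of projects it ships copies of
             (hypothetical encoding for this sketch; assumed acyclic).
    """
    # Projects that bundle `changed` must wait for its release and then cut
    # their own; anything bundling those projects must wait in turn.
    bundlers = [p for p, copies in bundles.items() if changed in copies]
    return 1 + max((release_rounds(p, bundles) for p in bundlers), default=0)


# P, Q, and R all use the shared L2: one release of L2 and everybody has the fix.
print(release_rounds("L2", {}))
# → 1

# P ships its own copy of L2: P's users wait for a second release.
print(release_rounds("L2", {"P": {"L2"}}))
# → 2
```

With a deeper chain, say L2 bundling L and P bundling L2, a fix to L needs three releases in sequence under the same assumptions.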

> If the authors of L2 release a new version that P and Q are happy with, but that creates an extremely subtle segfault condition in R, then what? ...

> The packager could silently curse the author of R for not statically linking the necessary frozen-in-time version of L2 and thus bypassing this problem entirely.

More likely, the packager would patch L2 to fix the problem with R and then talk to the upstream L2 developers to get the patch included in L2 proper. This way, R's users get the fix right away and the problem gets eliminated at its source, in L2, rather than papered over in R's private copy of L2.

As I wrote in my original post, one of the big benefits of the "no bundling" policy is to make sure that patches flow upstream to where they belong instead of piling up in downstream repos where they do good for only one dependent project instead of all dependent projects.

I mean, sure, just typeset it in LaTeX and it'll breeze right past your Fortune 500 IT department's change management board.
Meanwhile, organizations too small for a change management board can outsource that function to Debian.