by kbenson 4049 days ago
How they purport to do packaging is interesting, but I'm not sure it will work well in the end. Having "bundles" that contain immutable sets of packages sounds good from a stability point of view, but unless they are entirely self-contained, you'll undoubtedly run into a library that you need to update for one bundle, which then forces you to update another entire bundle. If each bundle is entirely self-contained (allowing it to have its own set of libraries), you're essentially recreating a static binary through package semantics. This comes with the usual downsides of static binaries.

I'm interested in seeing it tried though. The learning is in the doing.


Self-contained packages are not a new idea. For example, PC-BSD has been doing this for years via their PBI package format. See the description of PBI here: http://www.pcbsd.org/en/package-management

I think PBI does de-duplication at the package manager level by manipulating hard-links to common files, rather than installing multiple copies.

> I think PBI does de-duplication at the package manager level by manipulating hard-links to common files, rather than installing multiple copies.

Which is, itself, a bad reinvention of Plan 9's Venti archival storage system. Having one, or two, or a million files on disk containing the same data should take up as much space as having just one. "Hard links" are a policy-level way to express shared mutability; deduplication of backing storage, meanwhile, should be a mechanism-level implementation detail.
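
To make the mechanism-level idea concrete, here is a rough sketch of content-addressed storage in the Venti style, where each block is keyed by the SHA-1 of its contents, so identical data is stored once no matter how many files reference it. This is a toy in-memory model, not Venti's actual interface; the `ContentStore` class, its method names, and the block size are all made up for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed block store (hypothetical, in-memory only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # SHA-1 hex digest -> block bytes, one copy per distinct block

    def write(self, data):
        """Store data; return the list of block digests needed to rebuild it."""
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha1(block).hexdigest()
            # A block that is already present is not stored a second time.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        return digests

    def read(self, digests):
        """Rebuild the original bytes from a list of block digests."""
        return b"".join(self.blocks[d] for d in digests)

    def stored_bytes(self):
        """Total bytes of backing storage actually in use."""
        return sum(len(block) for block in self.blocks.values())
```

Writing the same file twice returns the same digest list and does not grow `stored_bytes()`. Callers never see the sharing, which is the difference from hard links: there is no shared mutability to reason about.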

ZFS supports block-level deduplication, but it comes with heavy memory and performance requirements. File-level deduplication with hard links is lightweight and requires no special support (besides a filesystem that supports hard links, obviously).

Well, there is something to be said for the 'worse is better' approach.

Self-contained packages are not the problem; the individualized libraries shipped along with those packages are.

How does PBI handle minor library version differences then? If one package provides and uses mylib-1.3.1 and another provides and uses mylib-1.3.5, how is that distinguished at the core library level (the plain .so file)? My understanding is that what Clear Linux is attempting allows this level of granularity, to ensure that a package (really an amalgamation of individual packages in the sense of most current unixy distros) is functional and updated as a whole.

I believe that PBI works the same way. It allows different packages to independently use multiple versions of the same lib simultaneously. There is deduplication via hardlinks ONLY when the checksums match.
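
For what it's worth, the "hard-link only when the checksums match" idea is easy to sketch. The following is not PBI's code, just a minimal illustration under my own assumptions: the function names are hypothetical, SHA-256 is an arbitrary choice of checksum, and everything under `root` is assumed to live on one filesystem (hard links cannot cross filesystems).

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_with_hardlinks(root):
    """Replace byte-identical regular files under root with hard links.

    Files whose checksums differ (e.g. two versions of the same library)
    are left alone. Returns the number of files replaced by links.
    """
    first_seen = {}  # digest -> path of the first file seen with that content
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            original = first_seen.setdefault(file_digest(path), path)
            if original == path or os.path.samefile(original, path):
                continue  # first copy, or already linked to it
            # Create the link under a temporary name, then rename it over the
            # duplicate so the path never disappears part-way through.
            tmp = path + ".dedupe-tmp"
            os.link(original, tmp)
            os.replace(tmp, path)
            replaced += 1
    return replaced
```

So two packages shipping an identical mylib build end up sharing one inode, while mylib-1.3.1 and mylib-1.3.5 stay separate files. This is only safe if installed files are treated as read-only, since a write through one name changes every linked copy.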

That's what made it attractive to me: I've painted myself into a corner several times when trying to install Ubuntu PPAs that want conflicting versions of shared libs.

I'm almost of the mindset that just having userspace apps as lxc containers with everything an app needs may be better for the most-used applications...

Given how relatively cheap even fairly big SSDs are, is it really worth the storage savings for your browser to share a couple .so files with your mp3 player?

I actually like how PC-BSD pbi packages work... given the number of times solutions have been made to work around the issue and reduce space... I'm not sure it's always worth it. At least not in the desktop space.

Imagine the problem of tracking down all the different versions of a library when an exploit is found. If you have 20, or even 50 different apps that bundled OpenSSL, imagine the hassle of making sure each one was vetted and updated as needed, not to mention the delay in getting all the different packages rebuilt and pushed (which may be a small delay, or may not, depending on the vendor).

You regularly use 20 or 50 end-user applications that use OpenSSL?
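
One way to put a number on that for a given machine is to walk the installed app bundles and list which ones ship their own OpenSSL. A rough sketch, assuming a hypothetical layout with one directory per app under a common root; the function name and the filename pattern are mine, not taken from any particular packaging system.

```python
import os
import re
from collections import defaultdict

# Matches shared-object names such as libssl.so.1.0.0 or libcrypto.so.1.1.
_OPENSSL_SO = re.compile(r"^(libssl|libcrypto)\.so\.(\d+(?:\.\d+)*)$")


def find_bundled_openssl(apps_root):
    """Map each app directory under apps_root to the OpenSSL .so version
    suffixes found anywhere inside it. Apps with no bundled copy are omitted."""
    found = defaultdict(set)
    for app in sorted(os.listdir(apps_root)):
        app_dir = os.path.join(apps_root, app)
        if not os.path.isdir(app_dir):
            continue
        for _dirpath, _dirnames, filenames in os.walk(app_dir):
            for name in filenames:
                match = _OPENSSL_SO.match(name)
                if match:
                    found[app].add(match.group(2))
    return dict(found)
```

The version suffix on the .so name only identifies the ABI series, not the exact patch release, so this answers "who bundles OpenSSL at all" rather than "who is vulnerable". A real audit would have to inspect each library itself.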

I'm not talking about the low-level OS applications here... I'm talking about end-user applications and major exposed services.

For that matter, each of those applications needs to be updated, vetted and packaged... it's a matter of the level and completeness of packages.

If they are container-style micro-VMs, I don't think they would need to share any libraries, in theory at least.

I mean, it is possible to isolate all of them completely.

It may end up very heavy, though. But (and I could be wrong on this) with the constant growth of storage capacity, network bandwidth, and RAM, and the progress made in making "containers" lighter, I don't think this "heavy" downside of immutable infrastructure will be a real issue in the future.

No, but identifying which of your 20 micro-VMs is susceptible to the next OpenSSL exploit, and rolling out the fixes, may be. Having local library versions for every app/service is simpler in some respects and more complex in others. Managing service prerequisites becomes easy and managing service feature updates becomes easier than it was, but managing service security updates becomes more complex. Juggling these different needs and capabilities is where it gets interesting.

I got your point.

I guess it just leads to a turning point, where end users won't have to worry about security updates for library x or y, but rather about updating the application they're using. If you use containers/micro-VMs and a security update is needed, the container "maintainer" would be in charge of pushing it, and then you just need to update your container.

I'm not sure which is more constraining: dealing with conflicts, or being careful to rely only on well-maintained "containers".

I guess, for production environments, the second option looks like a wise choice.