	by s17tnet 1384 days ago
	Digging through their repos is interesting. They also proposed a Docker driver that lazily downloads image bits from CernVM-FS as the files in the overlay are accessed [0]; they claim a significant drop in process startup time. [0] https://indico.cern.ch/event/567550/contributions/2627182/at...

4 comments

siscia 1384 days ago

Yeah, I work on that.

Startup is clearly faster because we don't download the image (especially compared to downloading the layers and then starting the Docker image).

The main "trick" is that Docker images usually include a lot of files that are never actually accessed during standard operation, so pulling them is unnecessary most of the time.
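
To make the idea concrete, here is a minimal toy sketch of fetch-on-first-access. It is not the actual CernVM-FS or Docker driver code; `LazyImage`, `read`, and the in-memory `remote` dict are hypothetical names standing in for a real remote store and filesystem layer.

```python
from typing import Dict


class LazyImage:
    """Toy model of an image whose files are fetched only on first access."""

    def __init__(self, remote: Dict[str, bytes]) -> None:
        # Hypothetical stand-in for the remote store holding the full image.
        self._remote = remote
        # Files that have already been pulled locally.
        self._cache: Dict[str, bytes] = {}
        # Running total of bytes actually transferred.
        self.bytes_fetched = 0

    def read(self, path: str) -> bytes:
        """Return file contents, fetching from the remote only on a cache miss."""
        if path not in self._cache:
            data = self._remote[path]
            self._cache[path] = data
            self.bytes_fetched += len(data)
        return self._cache[path]

    def total_size(self) -> int:
        """Size of the whole image, i.e. what an eager pull would transfer."""
        return sum(len(data) for data in self._remote.values())


image = LazyImage({
    "/usr/bin/app": b"x" * 100,
    "/usr/share/doc/manual": b"y" * 10_000,
})
image.read("/usr/bin/app")
print(image.bytes_fetched, "of", image.total_size(), "bytes fetched")
```

In this toy model a process that only touches `/usr/bin/app` transfers that one file, while an eager pull would have transferred everything before the process could start.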

s17tnet 1384 days ago

Yep, I figured as much. I suppose your images consist mostly of large datasets to crunch, plus a smallish part with the R/Python/whatever code to execute.

jblomer 1384 days ago

The data is not part of the images. It's only the software. In the vast majority of cases, any particular data processing job requires only a tiny fraction of the available software. For instance, a few hundred MB out of a few tens of GB for a typical LHC application software release.

yjftsjthsd-h 1384 days ago

In practice that sounds like an excellent optimization, but in theory it annoys me that we're doing that rather than figuring out how to build better binaries.

jakogut 1384 days ago

I work on a platform that handles fleets of edge devices running a Linux-based OS, where applications are distributed as container images. Nvidia in particular is rather awful to support, as any users with their hardware inevitably build 10+ GB images, largely composed of libraries and samples they'll never use. Plenty of other users are unaware that they can improve the speed and reliability of their deployments by trimming the fat from their images.

A lot of work goes into properly handling and optimizing the download and distribution of excessively large application images, often on slow and unreliable networks, when smaller is always faster and more reliable.

xani_ 1384 days ago

I'd love that for rescue media: just load what you need and mirror the rest of the image to RAM in the background.
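
A rough sketch of what that could look like, assuming the rescue medium can be modeled as a path-to-bytes mapping. Everything here (`MirroringImage`, `mirror_in_background`, `fully_mirrored`) is a hypothetical illustration, not an existing tool.

```python
import threading
from typing import Dict


class MirroringImage:
    """Toy model: serve reads on demand while copying everything to RAM."""

    def __init__(self, media: Dict[str, bytes]) -> None:
        # Hypothetical stand-in for the slow rescue medium.
        self._media = media
        # In-RAM copy that fills up over time.
        self._ram: Dict[str, bytes] = {}
        self._lock = threading.Lock()

    def read(self, path: str) -> bytes:
        """Return a file, copying it into RAM first if it is not there yet."""
        with self._lock:
            if path not in self._ram:
                self._ram[path] = self._media[path]
            return self._ram[path]

    def mirror_in_background(self) -> threading.Thread:
        """Start a thread that copies every remaining file into RAM."""
        def worker() -> None:
            for path in list(self._media):
                self.read(path)

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        return thread

    def fully_mirrored(self) -> bool:
        """True once every file on the medium has a copy in RAM."""
        with self._lock:
            return len(self._ram) == len(self._media)


image = MirroringImage({"/sbin/fsck": b"a" * 50, "/usr/share/extras": b"b" * 5000})
mirror = image.mirror_in_background()
image.read("/sbin/fsck")   # available immediately, whether or not mirroring is done
mirror.join()              # once this returns, the medium could be removed
print(image.fully_mirrored())
```

Foreground reads and the background copy share one lock here for simplicity; a real implementation would likely want finer-grained locking so a large background copy does not stall an urgent read.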

rkeene2 1384 days ago

AppFS is similar, and I already have a Docker image called "rkeene/appfs" on Docker Hub.

fps_doug 1384 days ago

We developed something similar in-house. For most images it's a notable startup speedup.

chvish 1384 days ago

Mind sharing what your in-house solution is? I have been working on something similar with extracted layers on AFS and using Podman’s additional layer store.
