| HN Mirror



	by cosmotic 105 days ago
	Why does the model data need to be stored in the image? Download the model data on container startup using whatever method works best.
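A minimal sketch of that "download on startup" idea, assuming the weights live at some plain URL and that a local cache directory survives container restarts. The function name `ensure_weights`, the `cache_dir` argument, and the URL are all hypothetical, not anything from the post being discussed:

```python
import os
import shutil
import tempfile
import urllib.parse
import urllib.request
from pathlib import Path


def ensure_weights(url: str, cache_dir: str) -> Path:
    """Return a local path for `url`, downloading it only if it is not cached."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)

    # Name the cached file after the last path segment of the URL (an assumption).
    target = cache / Path(urllib.parse.urlparse(url).path).name
    if target.exists():
        return target  # warm start: no network access needed

    # Write to a temp file in the same directory, then rename atomically,
    # so a crashed download never leaves a half-written weights file behind.
    fd, tmp = tempfile.mkstemp(dir=cache)
    os.close(fd)
    try:
        with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
            shutil.copyfileobj(resp, out)
        os.replace(tmp, target)
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)
    return target
```

Called once from the container entrypoint, this keeps the image small and pays the download cost only on a cold cache.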

2 comments

za_mike157 105 days ago

You are correct! From our tests, storing model weights in the image isn't the preferred approach for weights larger than ~1GB. We run a distributed, multi-layer cache system to handle this, and we can load roughly 6-7GB of files with a p99 of <2.5s.


jono_irwin 105 days ago

hey cosmotic, we're not really advocating for storing model weights in the container image.

even the smaller nvidia images (like nvidia/cuda:13.1.1-cudnn-runtime-ubuntu24.04) are about 2GB before adding any python deps, and that is a problem.

if you split the image into chunks and pull on-demand, your container will start much faster.
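a rough sketch of the chunk-and-pull-on-demand idea, not the actual implementation: the class `LazyChunkedBlob` and the `fetch_chunk` callback are made-up names, and a real system would fetch chunks from a registry or cache instead of memory.

```python
from typing import Callable, Dict, List


class LazyChunkedBlob:
    """Serve byte ranges of a large blob, fetching fixed-size chunks only when read."""

    def __init__(self, size: int, chunk_size: int,
                 fetch_chunk: Callable[[int], bytes]) -> None:
        self.size = size
        self.chunk_size = chunk_size
        self._fetch = fetch_chunk            # hypothetical: index -> chunk bytes
        self._cache: Dict[int, bytes] = {}   # chunks already pulled
        self.fetched: List[int] = []         # fetch order, kept for inspection

    def read(self, offset: int, length: int) -> bytes:
        """Return up to `length` bytes starting at a non-negative `offset`."""
        end = min(offset + length, self.size)
        if offset >= end:
            return b""
        first = offset // self.chunk_size
        last = (end - 1) // self.chunk_size
        parts = []
        for idx in range(first, last + 1):
            if idx not in self._cache:
                # Only chunks that overlap the requested range are ever pulled.
                self._cache[idx] = self._fetch(idx)
                self.fetched.append(idx)
            parts.append(self._cache[idx])
        data = b"".join(parts)
        start = offset - first * self.chunk_size
        return data[start:start + (end - offset)]
```

the point is that a container which touches only a small part of a 2GB image at startup only waits for those chunks, and the rest can arrive later or never.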


fwip 105 days ago

Just pre-install the NVIDIA layer on the filesystem instead of docker-pulling it for every single machine.
