by jono_irwin 106 days ago
hey cosmotic, we're not really advocating for storing model weights in the container image.

even the smaller nvidia images (like nvidia/cuda:13.1.1-cudnn-runtime-ubuntu24.04) are about 2 GB before adding any python deps, and pulling all of that before the container can start is the problem.

if you split the image into chunks and pull them on demand, your container will start much faster, because it only waits for the chunks it actually reads at startup.
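
a rough sketch of the chunked, on-demand idea in python. this is only an illustration under simple assumptions (fixed-size chunks, an in-memory stand-in for the registry), not how any particular runtime implements it. `LazyImage`, `fetch_chunk` and `make_remote` are hypothetical names made up for the example.

```python
from typing import Callable, Dict, List, Tuple

CHUNK_SIZE = 4  # tiny on purpose; a real system would use far larger chunks


class LazyImage:
    """Serves byte ranges of an image, fetching only the chunks a read touches."""

    def __init__(self, size: int, fetch_chunk: Callable[[int], bytes],
                 chunk_size: int = CHUNK_SIZE) -> None:
        self.size = size
        self.fetch_chunk = fetch_chunk  # hypothetical callback: chunk index -> bytes
        self.chunk_size = chunk_size
        self.cache: Dict[int, bytes] = {}  # chunks already pulled

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or length < 0:
            raise ValueError("offset and length must be non-negative")
        end = min(offset + length, self.size)
        if offset >= end:
            return b""
        first = offset // self.chunk_size
        last = (end - 1) // self.chunk_size
        parts = []
        for index in range(first, last + 1):
            if index not in self.cache:
                # Pull the chunk only the first time something reads it.
                self.cache[index] = self.fetch_chunk(index)
            parts.append(self.cache[index])
        data = b"".join(parts)
        start = offset - first * self.chunk_size
        return data[start:start + (end - offset)]


def make_remote(blob: bytes,
                chunk_size: int = CHUNK_SIZE) -> Tuple[Callable[[int], bytes], List[int]]:
    """In-memory stand-in for a remote store; records which chunks were requested."""
    fetched: List[int] = []

    def fetch_chunk(index: int) -> bytes:
        fetched.append(index)
        return blob[index * chunk_size:(index + 1) * chunk_size]

    return fetch_chunk, fetched


if __name__ == "__main__":
    blob = bytes(range(32))  # pretend this is the full image
    fetch, fetched = make_remote(blob)
    image = LazyImage(len(blob), fetch)
    image.read(5, 6)  # touches only chunks 1 and 2 out of 8
    print("chunks pulled:", fetched)
```

the point of the sketch is that a read of a few bytes pulls only the chunks covering that range, and repeated reads hit the local cache instead of the network.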

1 comment

Just pre-install the NVIDIA layer on the filesystem instead of docker-pulling it for every single machine.