I've never played around with nvidia-docker. Is there much, if any, additional overhead for GPU compute when running through it vs. natively on an AMI?
nvidia-docker is just Docker set up so that the container can see the physical GPU device; it also mounts the runtime driver libraries that match the driver installed on the host. It makes life way easier if you need portability across different Linux hosts for GPU-driven software. The Nvidia container images also include some of the "proprietary" free software packages like cuDNN, which accelerate some popular deep learning packages with optimized CUDA implementations.
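For anyone who hasn't tried it, here is a minimal sketch of what launching a GPU container looks like, written as a small Python helper that only builds the command line. Assumptions: `my-cuda-image:latest` is a hypothetical image name, the `--gpus all` form needs Docker 19.03+ with the NVIDIA container toolkit on the host, and the `nvidia-docker run` form is the older wrapper. The helper itself (`gpu_docker_command`) is made up for illustration.

```python
import shlex


def gpu_docker_command(image, command, use_legacy_wrapper=False):
    """Build an argv list that runs `command` inside `image` with GPUs exposed."""
    if use_legacy_wrapper:
        # Older setups: the nvidia-docker wrapper takes care of exposing
        # the GPU devices and the matching driver libraries.
        argv = ["nvidia-docker", "run", "--rm", image]
    else:
        # Newer setups: plain docker with the --gpus flag.
        argv = ["docker", "run", "--rm", "--gpus", "all", image]
    return argv + shlex.split(command)


if __name__ == "__main__":
    # "my-cuda-image:latest" is a placeholder, not a real image.
    print(" ".join(gpu_docker_command("my-cuda-image:latest", "nvidia-smi")))
```

If `nvidia-smi` lists your card from inside the container, the device and driver libraries were passed through correctly.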
If you run on a bare-metal platform, like your laptop booting Linux or a bare-metal cloud, there's a very small amount of overhead for using nvidia-docker, mainly just the same overhead as running a regular Docker container (a container is just chroot + Linux kernel cgroups + kernel namespaces).
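You can see the "just namespaces and cgroups" part for yourself. A rough sketch, assuming a Linux host with `/proc` mounted (the function names are mine, not from any library): run it once on the host and once inside a container and compare the output.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace name: kernel identifier} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    if not os.path.isdir(ns_dir):
        # Not Linux, or no such process.
        return ids
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Skip entries we are not allowed to read.
            continue
    return ids


def cgroup_membership(pid="self"):
    """Return the lines of /proc/<pid>/cgroup, or [] if unavailable."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return [line.strip() for line in f if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(name, ident)
    for line in cgroup_membership():
        print(line)
```

The namespace identifiers and cgroup paths typically differ between host and container, but it is the same kernel underneath, which is why the overhead is so small.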
If you're in an AMI on AWS, it's a virtual machine anyway, so there's virtualization overhead which is quite a bit higher than container overhead, but there's other baggage as well such as shared tenancy/noisy neighbors, and possibly oversubscription of hardware to the virtualized environment.
If you're in a Docker container in an AMI, there's the slight container overhead plus the virtualization overhead, along with the baggage and benefits that come with it. Virtualization overhead is probably an order of magnitude higher than container overhead. Running "natively" on an AMI is not so native (though Amazon is trying to improve that with their C5 instances).
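The honest way to settle this for your own workload is to measure it. Below is a minimal timing sketch, not a real benchmark: `cpu_workload` is a stand-in I made up, and a pure CPU loop will not show GPU or I/O effects, so swap in your actual job. Run the same script on bare metal, in a container, and in a VM, then compare the medians.

```python
import statistics
import time


def cpu_workload(n=200_000):
    """Placeholder workload: sum of squares below n. Replace with your real job."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def median_runtime(fn, repeats=5):
    """Median wall-clock time of fn() in seconds over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def relative_overhead(baseline_s, measured_s):
    """Fractional slowdown relative to the baseline (0.25 means 25% slower)."""
    return (measured_s - baseline_s) / baseline_s


if __name__ == "__main__":
    print(f"median runtime: {median_runtime(cpu_workload):.4f} s")
```

Feed the bare-metal median and the container or VM median into `relative_overhead` to get a number you can actually argue about.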
>If you're in an AMI on AWS, it's a virtual machine anyway, so there's virtualization overhead which is quite a bit higher than container overhead, but there's other baggage as well such as shared tenancy/noisy neighbors, and possibly oversubscription of hardware to the virtualized environment.
Can you post or cite something related to virtualization overhead being "probably an order of magnitude higher than container overhead"?