by ymt123 3579 days ago
It's great to see people talking about the infrastructure they use to manage their deep learning workloads.

One area where we've had trouble with other orchestration tools (e.g. Docker Swarm) is managing resources at anything finer than whole boxes. They are all good at managing CPU, RAM, and disk, but we've had trouble with requests like "give this task GPU 2". We had planned to try Mesos (given that we already run it for other things), but it sounds like maybe we should take a harder look at Kubernetes first.
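
To make the "give this task GPU 2" case concrete, here is a rough sketch of per-device allocation on a single box. It is not how any of the tools above do it; GpuPool and run_on_gpu are hypothetical names made up for illustration. The only real mechanism it relies on is the CUDA_VISIBLE_DEVICES environment variable, which limits the GPUs a CUDA process can see.

```python
import os
import subprocess
import sys


class GpuPool:
    """Hypothetical per-device allocator: tracks which GPU indices are free."""

    def __init__(self, num_gpus):
        self.free = set(range(num_gpus))

    def acquire(self, index=None):
        # Take the requested device if given, otherwise the lowest free one.
        if index is None:
            if not self.free:
                raise RuntimeError("no free GPU")
            index = min(self.free)
        elif index not in self.free:
            raise RuntimeError(f"GPU {index} is busy or does not exist")
        self.free.remove(index)
        return index

    def release(self, index):
        self.free.add(index)


def run_on_gpu(pool, cmd, index=None):
    """Run cmd with exactly one GPU exposed via CUDA_VISIBLE_DEVICES."""
    gpu = pool.acquire(index)
    try:
        # Copy the current environment and restrict the visible devices.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        return subprocess.run(
            cmd, env=env, capture_output=True, text=True, check=True
        )
    finally:
        # Always hand the device back, even if the task fails.
        pool.release(gpu)
```

Something like run_on_gpu(GpuPool(4), ["python", "train.py"], index=2) would then pin a (hypothetical) train.py to the third device. A real scheduler would also have to track devices across machines and survive restarts, which is exactly the part we've found hard.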