by kirillseva 3122 days ago
My org is looking to move its machine learning workloads to Batch as the underlying infra.

All I want is to be able to do this:

1) Specify a DAG of tasks. Each task is a Docker image, a CMD string, and CPU and memory limits.
2) Hit an API to run it for me. Each task runs on a new spot instance.
3) Query this service about the state of the DAG and of each individual node.
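To make the three requirements above concrete, here is a minimal Python sketch of what such a DAG description and status query could look like. It is not the Batch API: `Task`, `Dag`, `State`, and the `launch` callback are hypothetical names invented for illustration, and `launch` stands in for whatever would actually start a container on a fresh spot instance.

```python
from dataclasses import dataclass, field
from enum import Enum
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class Task:
    """One node of the DAG (hypothetical shape, not a real Batch type)."""
    name: str
    image: str       # Docker image to run
    cmd: str         # CMD string passed to the container
    cpus: int        # CPU limit
    memory_mb: int   # memory limit in MiB
    depends_on: list[str] = field(default_factory=list)
    state: State = State.PENDING


class Dag:
    def __init__(self, tasks):
        self.tasks = {t.name: t for t in tasks}
        for t in tasks:
            for dep in t.depends_on:
                if dep not in self.tasks:
                    raise ValueError(f"{t.name} depends on unknown task {dep}")

    def execution_order(self):
        # Map each task to its predecessors; raises graphlib.CycleError on a cycle.
        graph = {t.name: set(t.depends_on) for t in self.tasks.values()}
        return list(TopologicalSorter(graph).static_order())

    def run(self, launch):
        """Run tasks in dependency order.

        `launch(task) -> bool` is an assumed callback that starts the
        container somewhere and reports whether it succeeded.
        """
        for name in self.execution_order():
            task = self.tasks[name]
            upstream_ok = all(
                self.tasks[d].state is State.SUCCEEDED for d in task.depends_on
            )
            if not upstream_ok:
                # Simplification: a task whose dependency failed is marked
                # FAILED without being launched.
                task.state = State.FAILED
                continue
            task.state = State.RUNNING
            task.state = State.SUCCEEDED if launch(task) else State.FAILED

    def status(self):
        """Return (overall DAG state, per-node states)."""
        nodes = {name: t.state for name, t in self.tasks.items()}
        states = set(nodes.values())
        if State.FAILED in states:
            overall = State.FAILED
        elif states == {State.SUCCEEDED}:
            overall = State.SUCCEEDED
        elif states == {State.PENDING}:
            overall = State.PENDING
        else:
            overall = State.RUNNING
        return overall, nodes
```

With something like this, a caller would build a `Dag` from a list of `Task` objects, call `run` with a launcher of their choice, and poll `status()` for the state of the whole DAG and of each node. A real service would run independent tasks in parallel and persist state; this sketch runs them one at a time in memory to keep the idea visible.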

Sounds like if AWS provides an API to create a Batch cluster (or whatever you call it) and lets tasks be defined in terms of which Docker image to run with which command, you'll satisfy this desire.

1 comment

That is in line with our vision for Batch: to be the engine for systems where you essentially describe a DAG and we run and hyperoptimize the execution for you. We do some of what you’re asking for, but that’s great feedback on what you’d like to do.