by ronack 3488 days ago
This seems like a step backwards to me in some ways. I'd prefer to see them evolve Lambda to support containers, longer jobs, and better workflows instead. I thought we were moving away from EC2 with its slow provisioning, spot bidding, and per-hour billing.

I think there is a useful distinction and place for this.

Lambda is for shared compute. You don't need a dedicated server to run a "function" that takes < 60 seconds and can be called statelessly as an API endpoint.

This is for compute-heavy batch processing on dedicated hosts. It's a pain to do this at scale!

I've built systems for running large-scale, embarrassingly parallel life science problems on EC2, and I wish I'd had something like this!

Imagine you have a set of input S3 files, and each needs multiple hours of compute to produce output S3 files. It doesn't seem that hard, until EC2 instances fail, programs crash, etc.
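
To make the failure handling concrete, here is a minimal sketch of the retry bookkeeping you end up writing yourself. It is only an illustration: `run_with_retries`, `flaky_job`, and the `input/` and `output/` key names are hypothetical, and no real S3 or EC2 calls are made.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def run_with_retries(
    inputs: Iterable[str],
    process: Callable[[str], str],
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run process() on each input key, retrying on failure.

    Returns (outputs, failures): outputs maps input key -> output key,
    failures maps input key -> the last error message seen.
    """
    outputs: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for key in inputs:
        for attempt in range(1, max_attempts + 1):
            try:
                outputs[key] = process(key)
                failures.pop(key, None)  # an earlier attempt may have failed
                break
            except Exception as exc:  # a crashed program or lost instance lands here
                failures[key] = f"attempt {attempt}: {exc}"
                if attempt < max_attempts:
                    time.sleep(backoff_seconds * attempt)  # simple linear backoff
    return outputs, failures


# Hypothetical job: fails the first time it sees a key, then succeeds.
_seen = set()


def flaky_job(key: str) -> str:
    if key not in _seen:
        _seen.add(key)
        raise RuntimeError("instance terminated")
    return key.replace("input/", "output/")
```

Calling `run_with_retries(["input/a.dat"], flaky_job)` retries the failed first attempt and reports the key under `outputs`. Doing the same across many machines, with jobs that run for hours, is where it gets painful.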

And the idea here is that you have a large parallelizable problem but you don't need the Hadoop ecosystem? To put it another way, why not just use EMR?
Yes, no Hadoop.

Quite a lot of life science work is stand-alone programs that are domain-specific and read and write flat-files.
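
A rough sketch of what wrapping one of these programs tends to look like: copy the input to local scratch space, run the tool on ordinary files, then copy the result back out. Everything named here is an assumption for illustration. `run_flat_file_tool` and `UPPER_TOOL` are hypothetical, and the local copies stand in for whatever download and upload step you actually use.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List

# Stand-in for a domain-specific tool: reads argv[1], writes uppercase text to argv[2].
UPPER_TOOL: List[str] = [
    sys.executable,
    "-c",
    "import sys; open(sys.argv[2], 'w').write(open(sys.argv[1]).read().upper())",
]


def run_flat_file_tool(command: List[str], input_path: Path, output_path: Path) -> None:
    """Stage a file into scratch space, run a command-line tool on it, stage the result out.

    The tool is invoked as: command + [local_input, local_output].
    """
    with tempfile.TemporaryDirectory() as scratch:
        local_in = Path(scratch) / input_path.name
        local_out = Path(scratch) / "result.out"
        shutil.copy(input_path, local_in)  # stage in (would be a download in practice)
        # check=True raises CalledProcessError if the tool exits non-zero
        subprocess.run(command + [str(local_in), str(local_out)], check=True)
        shutil.copy(local_out, output_path)  # stage out (would be an upload in practice)
```

The tool itself never needs to know where the data came from; it only sees plain files on a local filesystem.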

This. As someone who works on bioinformatics workflows, one of the more difficult aspects of my job is trying to explain to other software folks why we do what we do. While scheduling is a solved problem, the issue is that you're dealing with a ton of command-line tools that expect POSIX filesystems and often aren't parallelized.
Also every legacy mainframe COBOL batch system on the planet.
Will AWS Batch support the running of COBOL batch programs?