| HN Mirror



	by mason55 3489 days ago
	And the idea here is that you have a large parallelizable problem but you don't need the Hadoop ecosystem? To put it another way, why not just use EMR?

1 comments

hackcrafter 3488 days ago

Yes, no Hadoop.

Quite a lot of life science work is done with stand-alone, domain-specific programs that read and write flat files.


jghn 3488 days ago

This. As someone who works on bioinformatics workflows, one of the more difficult aspects of my job is explaining to other software folks why we do what we do. While scheduling is a solved problem, the issue is that you're dealing with a ton of command-line tools that expect POSIX filesystems and often don't involve any parallelization.
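
A minimal sketch of that pattern, assuming a single-threaded tool that takes one input path as its last argument and writes its result to stdout. The names `run_tool` and `run_batch` are illustrative, not part of any real scheduler or of AWS Batch:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_tool(command, in_path, out_dir):
    """Run one stand-alone tool invocation: flat file in, flat file out."""
    out_path = Path(out_dir) / (Path(in_path).name + ".out")
    with open(out_path, "wb") as out_file:
        # check=True raises CalledProcessError if the tool exits non-zero.
        subprocess.run([*command, str(in_path)], stdout=out_file, check=True)
    return out_path


def run_batch(command, in_paths, out_dir, max_workers=4):
    """Fan a single-threaded tool out over many independent input files."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # Threads are enough here: each worker only waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: run_tool(command, p, out_dir), in_paths))
```

Something like `run_batch(["my_aligner", "--fast"], sorted(Path("samples").glob("*.fastq")), "results")` would then process every sample independently; `my_aligner` and its flag are made up for the example. The parallelism lives entirely outside the tool, which is why a plain job queue fits this kind of work better than a framework that wants to own the data layout.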


marktangotango 3488 days ago

Also every legacy mainframe COBOL batch system on the planet.


mostrowskiatl 3488 days ago

Will AWS Batch support the running of COBOL batch programs?
