	by bslatkin 5704 days ago
	What I mean by this is that we're not doing the same thing as Cascading (http://www.cascading.org/), which requires you to transform your problem into the tuple-space domain. Stream processing frameworks like Cascading are for green-field implementations that maximize incremental performance. The Pipeline API, on the other hand, is task-oriented: developers use it procedurally, and the focus is on passing parameters and return values between tasks and on scheduling them. That makes it easy to reuse your existing code in this framework. Think of it as something closer to a parallelizable Bash than a data processing framework.
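
	To make the "parallelizable Bash" idea concrete, here's a rough sketch of the task-oriented style. This is not the Pipeline API itself. It uses Python's standard concurrent.futures as a stand-in, and count_words, total, and run_pipeline are made-up names for illustration. The point is only that plain functions get scheduled as tasks and their return values become parameters of the next step:

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    """Ordinary procedural code, reused unchanged as a task (hypothetical example)."""
    return len(text.split())


def total(counts):
    """A follow-up task that takes the earlier tasks' return values as its parameter."""
    return sum(counts)


def run_pipeline(documents):
    # Fan out: schedule one independent task per document.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(count_words, doc) for doc in documents]
        # Fan in: collect each task's return value, in submission order.
        counts = [future.result() for future in futures]
    # Pass the collected return values on to the next task.
    return total(counts)


if __name__ == "__main__":
    print(run_pipeline(["a b c", "d e", "f"]))
```

	Nothing here had to be recast as tuples or streams. The existing functions stay as they are and only the wiring between them is new, much like chaining commands in a shell script.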