by sinisa
3743 days ago
Scio author here. A bit of background:
Spark and Flink are both frameworks with their own execution engines. Scalding is tightly coupled with Cascading + Hadoop as its execution engine (Tez support is also WIP).
Dataflow Java SDK / Apache Beam, on the other hand, is designed to be a simple abstraction with pluggable engines, and the Cloud Dataflow service is just one of many possible runners. Right now there are:
- local runner
- Dataflow runner, a fully managed service in GCP
- Spark runner
- Flink runner
Scio wraps the Dataflow Java SDK (Apache Beam) and can potentially leverage any available runner.
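
To make the "pluggable engines" idea concrete, here is a toy sketch in plain Python. It is not the Beam or Scio API: `Pipeline`, `LocalRunner`, and `PartitionedRunner` are hypothetical names invented for illustration. The point is only that the pipeline is described once, independent of any engine, and each runner decides how to execute it.

```python
from typing import Callable, Iterable, List


class Pipeline:
    """Engine-agnostic description of a job: an ordered list of flat-map steps.

    Hypothetical class for illustration; it holds no execution logic.
    """

    def __init__(self) -> None:
        self.steps: List[Callable[[str], Iterable[str]]] = []

    def flat_map(self, fn: Callable[[str], Iterable[str]]) -> "Pipeline":
        self.steps.append(fn)
        return self


class LocalRunner:
    """Executes every step in-process, one step at a time."""

    def run(self, pipeline: Pipeline, data: Iterable[str]) -> List[str]:
        items = list(data)
        for step in pipeline.steps:
            items = [out for item in items for out in step(item)]
        return items


class PartitionedRunner:
    """Stand-in for a distributed engine: splits the input into partitions
    and processes each partition independently before merging results."""

    def __init__(self, partitions: int = 2) -> None:
        self.partitions = partitions

    def run(self, pipeline: Pipeline, data: Iterable[str]) -> List[str]:
        items = list(data)
        chunks = [items[i::self.partitions] for i in range(self.partitions)]
        merged: List[str] = []
        for chunk in chunks:
            merged.extend(LocalRunner().run(pipeline, chunk))
        return merged


# The same pipeline definition is handed to two different runners.
words = Pipeline().flat_map(str.split).flat_map(lambda w: [w.lower()])
lines = ["Spark Flink", "Beam Scio Beam"]

local_result = LocalRunner().run(words, lines)
partitioned_result = PartitionedRunner(partitions=2).run(words, lines)
```

Both runners produce the same collection of words, though a partitioned engine generally makes no promise about element order. In the real SDK the runner is a pipeline option rather than a class you call directly, but the separation between describing the job and executing it is the same.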