by Gepsens
575 days ago
I remember that 2 years ago someone proposed adding stream processing to DataFusion, and PRs followed. But IMO stream processing is an entirely different beast, though some people could use DF's SQL engine for it. There are Rust projects like Arroyo.
Our approach has been to take pieces of DF (including the SQL frontend and expression engine) and embed them in our own dataflow and operators. This allows us to support low latency, distribution, watermark processing, and consistent checkpointing.
But the great thing about DF is that it’s designed as a toolkit for SQL-oriented data processing, so it’s relatively easy to pick and use just the pieces you need.
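
For anyone unfamiliar with the watermark processing mentioned above, here is a toy sketch in plain Rust of one common scheme (bounded out-of-orderness). It is not Arroyo's or DataFusion's code; `WatermarkTracker`, `allowed_lateness`, and `observe` are hypothetical names made up for illustration, and a real streaming engine would also have to handle idle sources, multiple partitions, and checkpointing of this state.

```rust
/// Toy watermark tracker (hypothetical, for illustration only).
/// The watermark trails the largest event time seen so far by a fixed bound.
struct WatermarkTracker {
    /// Largest event timestamp observed so far, if any.
    max_event_time: Option<u64>,
    /// How far behind the max event time the watermark is held (an assumed knob).
    allowed_lateness: u64,
}

impl WatermarkTracker {
    fn new(allowed_lateness: u64) -> Self {
        Self { max_event_time: None, allowed_lateness }
    }

    /// Current watermark, or None before any event has been seen.
    fn watermark(&self) -> Option<u64> {
        self.max_event_time
            .map(|max| max.saturating_sub(self.allowed_lateness))
    }

    /// Record an event timestamp. Returns true if the event arrived late,
    /// i.e. its timestamp was already behind the watermark when it arrived.
    fn observe(&mut self, event_time: u64) -> bool {
        let late = match self.watermark() {
            Some(wm) => event_time < wm,
            None => false,
        };
        // Advance the max event time; the watermark never moves backwards.
        let new_max = self
            .max_event_time
            .map_or(event_time, |max| max.max(event_time));
        self.max_event_time = Some(new_max);
        late
    }
}

fn main() {
    let mut tracker = WatermarkTracker::new(5);
    // Out-of-order event times; 13 arrives after 20 has pushed the watermark to 15.
    for t in [10, 12, 11, 20, 13] {
        let late = tracker.observe(t);
        println!("event_time={t} late={late} watermark={:?}", tracker.watermark());
    }
}
```

A batch-oriented query engine has no need for this kind of bookkeeping, which is part of why the streaming operators end up being separate from the SQL frontend and expression engine.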