by benesch
2313 days ago
Considerations are completely different in a streaming context. It’s not so much about how fast you can churn through terabytes of data; it’s more about how quickly you can turn around the incremental computation with each new datum.

There’s some serious research behind this product, in timely and differential dataflow, and I’d encourage you to check out some of that research before making sweeping performance claims. Frank’s blog post on TPC-H is a good place to start: https://github.com/frankmcsherry/blog/blob/master/posts/2017...

We definitely have some performance engineering work to do in Materialize, but don’t let the lack of vectorization scare you off. It’s just not as important for a streaming engine.