Hacker News
by tlipcon 3800 days ago
Todd from the Apache Kudu (incubating) team here. I'll check this thread throughout the day in case there are any questions (and try to check the original post for comments as well).
3 comments

Does Kudu colocate data from different tables with equal keys? If not, is this or a similar feature on the road map?
It doesn't yet. It's on our nebulous "we'd like to do this some time" roadmap, but we're currently concentrating on more basic stuff around stability and time series features.

Of course, this is a huge optimization for data warehousing applications, where two co-partitioned tables can be joined without any network data transfer, and in some cases could even use a merge join instead of hash-based strategies. But it's the usual time/scope/quality trinity, and we'd rather not compromise the third element.
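
To make the co-partitioning point concrete, here is a minimal sketch in Python. It is not Kudu code and says nothing about how Kudu would implement this; the function names (`partition_by_key`, `merge_join`, `colocated_join`) and the modulo partitioning scheme are assumptions made up for illustration. The idea is that if both tables are partitioned the same way on the join key and each partition is kept sorted by key, then partition i of one table only ever needs to be joined with partition i of the other, and that local join can be a single-pass merge.

```python
def partition_by_key(rows, num_partitions):
    """Place each (key, value) row in partition key % num_partitions,
    then keep every partition sorted by key (hypothetical scheme)."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[key % num_partitions].append((key, value))
    for part in parts:
        part.sort(key=lambda row: row[0])
    return parts


def merge_join(left, right):
    """Inner-join two key-sorted lists of (key, value) rows in one pass."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right-side rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][0] == left_key:
                for _, right_value in right[j:j_end]:
                    out.append((left_key, left[i][1], right_value))
                i += 1
            j = j_end
    return out


def colocated_join(left_rows, right_rows, num_partitions=4):
    """Join partition i with partition i only; no cross-partition traffic."""
    left_parts = partition_by_key(left_rows, num_partitions)
    right_parts = partition_by_key(right_rows, num_partitions)
    result = []
    for left_part, right_part in zip(left_parts, right_parts):
        result.extend(merge_join(left_part, right_part))
    return result


if __name__ == "__main__":
    users = [(1, "alice"), (5, "eve"), (3, "carol")]
    orders = [(1, "o1"), (2, "o2"), (5, "o5"), (1, "o1b")]
    for row in sorted(colocated_join(users, orders)):
        print(row)
```

In a real cluster each partition pair would live on the same node, so the loop in `colocated_join` stands in for work done locally on each server. Without co-partitioning, at least one side would have to be shuffled over the network before any join could start.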

Glad to hear this is at least being considered. The optimizations for data warehousing you mentioned are my use case. I understand that it is a very active project with a lot on the roadmap. It's a very cool project and I follow you guys on http://gerrit.cloudera.org/#/q/status:open
Also worth noting it's an open source project so if you're interested in contributing in this area, we'd love to have you on board.
You can also join us on Slack here if that's more your style: http://getkudu-slack.herokuapp.com
Still haven't congratulated you guys. Kudu is what I always wanted in a datastore.
Thanks, Alex! Appreciate the kind words.
Seriously, everything we were trying to achieve with c5, and more. I just wish the C++ was more modern, but I know when the project started.
We've just updated to C++11 as of a couple weeks ago. Partially the issue was when it started, partially the issue is that we have to support older platforms and we have a C++ client library to worry about.