by polskibus
3145 days ago
How does Dremio differ from PrestoDB? As far as I know, PrestoDB can also virtualize access to many data sources and join data between them. We didn't go deep with PrestoDB because our basic tests for multi-source joins ran very slowly, and it seemed to pull all data from both joined tables into one place. I'm not a PrestoDB expert, so maybe there's a better way to do it (all suggestions welcome). What's the differentiator? Is Dremio smarter somehow, avoiding copying all the data to perform a simple join? Or does it copy the data the same way, with Arrow making it faster than Presto? What's on your roadmap?
At the core of this vision are very advanced pushdowns (far beyond other OSS systems), a powerful self-service UI for managing, curating and sharing data (designed for analysts, not just engineers), and, most importantly, the first open source implementation of distributed relational caching for all types of data. You can see more details about this last part in a deck I presented at DataEngConf earlier today: https://www.slideshare.net/dremio/using-apache-arrow-calcite...
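
To make the pushdown point concrete, here is a minimal Python sketch of the difference between pulling both tables into one place and pushing the filter and join keys down to the sources. It is only an illustration, not how Dremio or PrestoDB actually plan queries: the two in-memory SQLite databases stand in for separate remote sources, and the table names and the helpers `make_sources`, `join_without_pushdown`, and `join_with_pushdown` are all hypothetical.

```python
import sqlite3


def make_sources():
    """Build two independent in-memory databases standing in for remote sources."""
    customers = sqlite3.connect(":memory:")
    customers.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
    customers.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, "PL" if i % 10 == 0 else "US") for i in range(1000)],
    )

    orders = sqlite3.connect(":memory:")
    orders.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    orders.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, i % 1000, float(i % 7)) for i in range(5000)],
    )
    return customers, orders


def join_without_pushdown(customers, orders, country):
    """Fetch every row from both sources, then filter and join locally."""
    cust_rows = customers.execute("SELECT id, country FROM customers").fetchall()
    order_rows = orders.execute("SELECT customer_id, amount FROM orders").fetchall()
    wanted = {cid for cid, c in cust_rows if c == country}
    total = sum(amount for cid, amount in order_rows if cid in wanted)
    return total, len(cust_rows) + len(order_rows)  # (result, rows transferred)


def join_with_pushdown(customers, orders, country):
    """Let each source apply the filter, so only matching rows are transferred."""
    cust_rows = customers.execute(
        "SELECT id FROM customers WHERE country = ?", (country,)
    ).fetchall()
    ids = [cid for (cid,) in cust_rows]
    if not ids:
        return 0.0, 0
    # Push the join keys to the second source as an IN list (fine for small key sets).
    placeholders = ",".join("?" * len(ids))
    order_rows = orders.execute(
        f"SELECT customer_id, amount FROM orders WHERE customer_id IN ({placeholders})",
        ids,
    ).fetchall()
    total = sum(amount for _, amount in order_rows)
    return total, len(cust_rows) + len(order_rows)  # (result, rows transferred)


if __name__ == "__main__":
    customers, orders = make_sources()
    print("no pushdown:", join_without_pushdown(customers, orders, "PL"))
    print("pushdown:   ", join_with_pushdown(customers, orders, "PL"))
```

Both functions return the same total, but with this synthetic data the first one moves 6000 rows and the second moves 600. A real engine would decide this in the query planner and would need a different strategy when the key set is too large for an IN list.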