by gane5h
4094 days ago
Really cool write-up – thanks! First time I’m hearing about CitusDB. They appear to be building a columnar, distributed database while preserving the Postgres frontend (similar to Redshift, Aster, Greenplum, etc.). It’s all in the details. I’m planning to investigate the following during my next weekend hack. Hope somebody can answer some pre-sales questions for me:
- how complete is the postgres functionality (e.g. lateral joins)
- can you set a sharding key to control the shard distribution
- does the database do multiple passes for queries with subselects
- usually one increases the replication factor (limited by budget) to improve query times, at the cost of slower loading. does the DB stage intermediate writes to batch them, or does the user need to do this? batching works really well for append-only, timestamped event data.
- do you have a job manager or scheduler, needed when you have multiple views that need to be updated without melting your infrastructure
- how easy is it to operate? does the database expose operational metrics so that you can see the load on each shard to potentially detect unbalanced shards?
- tips on hardware configuration (a big advantage of Redshift here is that you don’t have to run your own warehouse). maybe partner with MongoHQ?
It’d be nice to see some sample query plans graphically visualized.