|
|
|
|
|
by AtlasBarfed
818 days ago
|
|
If you are going to multi-node Postgres, you need to start planning for Cassandra/Dynamo. That is a BIG lift. Joins don't really practically scale at the Cassandra/Dynamo scale, because basically every row in the result set is subject to CAP uncertainty. "Big Data SQL" like Hive/Impala/Snowflake/Presto etc are more like approximations at true scale. Relational DBMS is sort of storage-focused in the design and evolution: you figure out the tables you need to store the data in a sensible way. They you add views and indexes to optimize for view/retrieval. Dyanmo/Cassandra is different, you start from the views/retrieval. That's why it is bad to start with these models for an application because you have not fully explored all your specific data structuring and access patterns/loads yet. By the time Postgres hits the single node limits, you should know what your highest volume reads/writes are and how to structure a cassandra/dynamo table to specifically handle those read/writes. |
|