|
|
|
|
|
by sgarland
159 days ago
|
|
Disagree, though in practice it depends on the query, cardinality of the various columns across table, indices, and RDBMS implementation (so, everything). A simple equijoin with high cardinality and indexed columns will usually be extremely fast. The same join in a 1:M might be fast, or it might result in a massive fanout. In the case of the latter, if your RDBMS uses a clustering index, and if you’ve designed your schemata to exploit this fact (e.g. a table called UserPurchase that has a PK of (user_id, purchase_id)) can still be quite fast. Aggregations often imply large amounts of data being retrieved, though this is not necessarily true. |
|
And many important datasets never make it into any kind of database like that. Very few people provide "index columns" in their CSV files. Or they use long variable length strings as their primary key.
OP pertains to that kind of data. Some stuff in text files.