by fifilura
814 days ago
Was this before BigQuery/Presto/Trino? To me it seems like those technologies would have been a good fit. They don't really use indexes; instead they read regular files stored in partitions (where the date is typically one of the partition keys). This means they only have to touch the data (e.g. the dates) that you are actually querying. They also scale out to however many CPUs a particular calculation needs, so they rarely choke on big queries. And big tables are not really an issue as long as you query only the partitions you need.
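To make the partition point concrete, here is a rough sketch of the pruning idea. It assumes a Hive-style `date=YYYY-MM-DD` directory layout; the function name `prune_partitions` and the file paths are made up for illustration and are not the API of BigQuery, Presto, or Trino.

```python
from datetime import date


def prune_partitions(paths, start, end):
    """Return only the files whose date=YYYY-MM-DD partition lies in [start, end]."""
    selected = []
    for path in paths:
        for segment in path.split("/"):
            if segment.startswith("date="):
                # Parse the partition value straight from the directory name.
                day = date.fromisoformat(segment[len("date="):])
                if start <= day <= end:
                    selected.append(path)
                break  # only one date partition per path in this sketch
    return selected


# Hypothetical table layout: one directory per day.
files = [
    "events/date=2021-01-01/part-0.parquet",
    "events/date=2021-01-02/part-0.parquet",
    "events/date=2021-01-03/part-0.parquet",
]

# A query filtered to Jan 2-3 never opens the Jan 1 file.
to_scan = prune_partitions(files, date(2021, 1, 2), date(2021, 1, 3))
```

The decision about what to read is made from the file paths alone, before any data is opened, which is why total table size matters much less than the size of the partitions a query selects.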
Of course, with 20/20 hindsight that decision is easy to criticize. I suspect their primary concerns were to minimize risk and cost while meeting our customer's requirements. Even today, making a brand-new Google product or a Facebook-backed open source project a hard dependency would be too much risk for an established business.