by verdverm
2709 days ago
Some things I might try...
1. Hadoop / HDFS / Spark on an ephemeral cluster with disk snapshots
2. Group 1M IDs into a single file
3. If the analysis only runs once a month, save the data daily, then prep it right before the analysis.
4. Consider using a Cassandra database
5. Rent a big machine where the data can fit into memory
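
For option 2, here is a rough sketch of what grouping could look like. It assumes the IDs are integers and each record is a simple (id, value) pair, which may not match the actual data. The names `bucket_path` and `append_records` and the CSV layout are made up for illustration.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Assumption: integer IDs, 1M consecutive IDs share one file (option 2 above).
IDS_PER_FILE = 1_000_000


def bucket_path(root: Path, record_id: int) -> Path:
    """Map an ID to the file that holds its block of 1M IDs."""
    bucket = record_id // IDS_PER_FILE
    return root / f"bucket_{bucket:06d}.csv"


def append_records(root: Path, records) -> dict:
    """Append (id, value) pairs to their bucket files.

    Returns a dict of file name -> number of rows written in this call.
    """
    # Group in memory first so each bucket file is opened only once per batch.
    grouped = defaultdict(list)
    for record_id, value in records:
        grouped[bucket_path(root, record_id)].append((record_id, value))

    counts = {}
    for path, rows in grouped.items():
        with path.open("a", newline="") as fh:
            csv.writer(fh).writerows(rows)
        counts[path.name] = len(rows)
    return counts
```

With this layout, IDs 0 through 999,999 land in `bucket_000000.csv`, the next million in `bucket_000001.csv`, and so on, so you end up with far fewer, larger files instead of one file per ID.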