by logicfiction
1238 days ago
That's a fair criticism in the edit. Part 2 will cover that a bit more. I did analyze the types of queries users ran against the data and which parts of the timeseries they used, and that partly informed our solution. I don't want to give away too much, but lifecycle retention adjustment ends up being lower value (though still worthwhile) compared to general space savings.
|
Are you able to reconcile some of the numbers and calculations in the article for me? (Understanding that you don't want to reveal any confidential info.) I see:
- 31 PB data + 10 PB application logs = 41 PB logs (uncompressed json) costs 7-figures (say ~$5M)
- 41 PB logs * 5% ORC compression = ~2 PB logs (compressed ORC) costs low 6-figures (say ~$300k)
I don't know what timeframe that cost is measured over, but that brings us to $300k / 2 PB = $0.15 / GB, which is far above S3's quoted costs, so I must be missing something.
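
To make the arithmetic above easy to check, here is a rough sketch in Python. The dollar amount is my own guess from the list above ("say ~$300k"), not a number from the article, and the `months` parameter is purely hypothetical since the billing timeframe is unknown. Decimal units are assumed (1 PB = 1,000,000 GB).

```python
# Back-of-the-envelope check of the numbers above.
# Assumptions: decimal units, and a guessed dollar figure for "low 6-figures".
PB_IN_GB = 1_000_000  # 1 PB = 1,000,000 GB (decimal)

raw_pb = 31 + 10              # 31 PB data + 10 PB application logs (uncompressed JSON)
orc_ratio = 0.05              # ORC output at ~5% of the raw size
orc_pb = raw_pb * orc_ratio   # about 2.05 PB

assumed_orc_cost_usd = 300_000  # guess, not from the article


def cost_per_gb(total_usd, size_pb):
    """Total cost divided by the stored size, in USD per GB."""
    return total_usd / (size_pb * PB_IN_GB)


def cost_per_gb_month(total_usd, size_pb, months):
    """Same figure spread over a hypothetical billing period in months."""
    return cost_per_gb(total_usd, size_pb) / months


print(round(cost_per_gb(assumed_orc_cost_usd, 2), 3))        # rounded 2 PB case
print(round(cost_per_gb(assumed_orc_cost_usd, orc_pb), 3))   # unrounded 2.05 PB case
print(round(cost_per_gb_month(assumed_orc_cost_usd, 2, 12), 4))  # if the cost were annual
```

If the figure turned out to cover a full year rather than a month, the implied per-GB-month rate would be twelve times smaller, which is the kind of gap the timeframe question is meant to resolve.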