Hacker News
by seized 1835 days ago
Glacier pricing can be hard to grok...

With Glacier as it is usually used, you don't read data directly from Glacier storage; it has to be restored to S3, where you then access it. That is where the restore charges and the delays come from, so you can pay a low rate for the bulk option, which takes up to 24 hours to restore your data to S3. But the real cost is the bandwidth from S3 back to your NAS/datacenter/etc., which brings it up to about $90 USD/TB.
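To make the restore-then-download flow concrete, here is a rough Python sketch of the two pieces involved: the restore request and the check for whether the restored copy is ready. The helper names (build_restore_request, restore_finished) are made up for illustration, and the bucket and key in the usage comment are placeholders. The boto3 calls are left as comments since they need real credentials.

```python
import re

def build_restore_request(days=7, tier="Bulk"):
    """Build the RestoreRequest payload for S3's RestoreObject call.

    `days` is how long the temporary restored copy stays readable in S3.
    """
    if tier not in ("Expedited", "Standard", "Bulk"):
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}

def restore_finished(restore_header):
    """Interpret the 'Restore' field that HeadObject returns.

    None means no restore was requested; ongoing-request="true" means
    the temporary copy is not ready yet.
    """
    if restore_header is None:
        return False
    match = re.search(r'ongoing-request="(true|false)"', restore_header)
    return bool(match) and match.group(1) == "false"

# Usage with boto3 (not run here; bucket and key are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.restore_object(Bucket="my-bucket", Key="backup.tar",
#                     RestoreRequest=build_restore_request(tier="Bulk"))
#   head = s3.head_object(Bucket="my-bucket", Key="backup.tar")
#   if restore_finished(head.get("Restore")):
#       s3.download_file("my-bucket", "backup.tar", "backup.tar")
```

The download at the end is the step that incurs the S3 bandwidth charge; the restore itself only makes the object readable again for the requested number of days.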

Other fees include request pricing, a small amount per 1,000 requests. So costs can go up a bit if you store 1 million small files in Glacier vs. 1,000 large files. There is also a tipping point (IIRC about 170KB) below which it is cheaper to store small files on S3 than on Glacier.
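A back-of-the-envelope version of that arithmetic, as a Python sketch. The $90/TB figure is the rough number from above; the per-1,000-request price and the 3 TB data size are placeholders I picked for illustration, not actual AWS rates, so plug in current pricing before trusting the output.

```python
def estimate_restore_cost(data_tb, object_count,
                          cost_per_tb=90.0,
                          cost_per_1000_requests=0.05):
    """Rough restore bill in USD: bandwidth plus per-request fees.

    cost_per_tb is the approximate figure quoted in this thread;
    cost_per_1000_requests is a placeholder, not a real AWS price.
    """
    bandwidth = data_tb * cost_per_tb
    requests = (object_count / 1000) * cost_per_1000_requests
    return bandwidth + requests

# Same 3 TB (assumed), stored as few large files vs. many small ones.
few_large = estimate_restore_cost(3, 1_000)
many_small = estimate_restore_cost(3, 1_000_000)
print(f"1,000 large files:     ${few_large:.2f}")
print(f"1,000,000 small files: ${many_small:.2f}")
```

Bandwidth dominates either way, but the request line item scales with object count, which is why bundling small files into larger archives before upload helps.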

Depending on your data and access patterns, it can be better to use Glacier as a second backup, which is what I do. All my data is backed up to Google Workspace, as that is "unlimited" for now. The most important subset (a few TB) also goes to Glacier. Glacier is pay as you go; there isn't some "unlimited" or "5TB for life" type deal that can change. If Google Workspace ever stops being "unlimited" or something happens to it, I have the most important data in Glacier, and it's data that I have no qualms paying >$1k to get back.

But for me restoring from Glacier means that my NAS is dead (ZFS RAIDZ2 on good hardware) and Google Workspace has failed me at the same time.

2 comments

Cool, thank you for the details. None of their marketing or FAQs for Glacier mention that getting the data back means going to S3 first and then paying S3's outgoing bandwidth costs. As deceptive as I expected.

I'll check out Google Workspace; that sounds like the right level of kludge for me, since this is the first time I've ever bothered to try to set up off-site backups. I only started using RAID a couple of years ago.

It makes more sense when you think of Glacier as a tier of S3, like Infrequent Access etc., which is what it is now. There used to be a standalone Glacier service, but you had to upload a blob and track the UID-to-file-name mapping yourself. That skipped S3 but was far more complex.
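For a sense of what that bookkeeping looked like, here is a minimal Python sketch of a local index mapping file names to archive IDs. The ArchiveIndex class and the JSON file layout are hypothetical, just one way to do it; the vault and file names in the usage comment are placeholders, and the boto3 call is commented out since it needs real credentials.

```python
import json
from pathlib import Path

class ArchiveIndex:
    """Tiny filename -> archive ID index persisted as a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load the existing index if there is one, else start empty.
        self.entries = json.loads(self.path.read_text()) if self.path.exists() else {}

    def record(self, filename, archive_id):
        """Remember which archive ID holds this file and save to disk."""
        self.entries[filename] = archive_id
        self.path.write_text(json.dumps(self.entries, indent=2))

    def lookup(self, filename):
        """Return the archive ID for a file; raises KeyError if unknown."""
        return self.entries[filename]

# Usage with boto3's Glacier client (not run here; names are placeholders):
#   import boto3
#   glacier = boto3.client("glacier")
#   index = ArchiveIndex("glacier-index.json")
#   with open("photos-2019.tar", "rb") as f:
#       resp = glacier.upload_archive(vaultName="my-vault", body=f,
#                                     archiveDescription="photos-2019.tar")
#   index.record("photos-2019.tar", resp["archiveId"])
```

Lose that index and you are left matching opaque IDs back to files by hand, which is a big part of why the S3 storage-class route is simpler.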

Workspace makes more sense for large amounts of data where you can take advantage of the "unlimited".

Backblaze B2 might be what you are looking for as an in-between option.

Are you sure about the "restored to S3" bit? Their SDK seems to fetch directly from Glacier.

Note that the official name is "S3 Glacier", so from AWS's public perspective, it is S3.