by whoami_whereami 2650 days ago
And how many days of downtime are you willing to tolerate while you are restoring that petabyte of data from your contraption? Let's say you have a 10Gbps internet connection (not cheap) all the way through to the Amazon data center. The data transfer will then take "only" about 12 days per petabyte.

Getting petabytes of storage isn't the problem, transferring the data back and forth is.
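
For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. The comment doesn't say how the 12 days was derived, so the `efficiency` parameter and its 0.77 value are my assumptions (protocol overhead, imperfect link utilization), as is treating a petabyte as 10^15 bytes. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 10**15  # decimal petabyte, in bytes (assumption)


def transfer_days(data_bytes: float, link_bps: float, efficiency: float = 1.0) -> float:
    """Days needed to move data_bytes over a link of link_bps bits per second.

    efficiency is the assumed fraction of the nominal line rate that
    actually carries payload (1.0 = perfect, which never happens).
    """
    seconds = data_bytes * 8 / (link_bps * efficiency)
    return seconds / SECONDS_PER_DAY


# Perfect 10 Gbps line rate: a bit over 9 days per petabyte.
print(f"ideal:     {transfer_days(PETABYTE, 10e9):.1f} days")

# Assumed ~77% effective throughput: roughly the 12 days quoted above.
print(f"realistic: {transfer_days(PETABYTE, 10e9, efficiency=0.77):.1f} days")
```

Even the ideal case is more than nine days, so the exact overhead figure doesn't change the conclusion.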


This is all true, but it sort of presupposes competence.

Taking a full month to recover a downed social media platform isn't really acceptable, but it's still better than being literally unable to recover it at all. Spending a small fortune to ship hardware to an AWS datacenter and convincing/paying them to load it directly would probably also be worthwhile, when we're talking about simply losing a $500M company. If the claim here about "no backup" is true, it's so profoundly stupid that everything I know about best practices sort of goes out the window. Approaches that any sensible person would consider unacceptably slow and unreliable are still a step up from a completely blank playbook.

(I guess the theory might be that Tumblr is such a trashfire it can't be restored, or would lose so much value in days/weeks of downtime that there's no point in even planning for that. Again, I don't really know how you run cost-benefit analyses when it's not entirely clear the project has benefits.)

> you can just colocate that server
And where does Amazon offer colo services? What they offer is Direct Connect at certain (non-Amazon) data centers. That costs about 20k per year for a 10Gb port, ON TOP of the colocation and cross-connect fees you are paying at the data center where you want to establish the connection. If you want to bring the restore time down to 12 hours, you need 24 connections, and at least as many servers, since no single server can handle 240Gb of traffic. So we are now at about 480k+X (large X!) per year per petabyte just for the connections you need in case of a catastrophic failure. You can't establish the connections "on demand" either: setting one up takes days or even weeks, even if ports are available immediately.
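
The scaling above can be sketched the same way. This is only an illustration using the figures from this comment (12 days per petabyte on one 10Gb port, about 20k per port per year); the function names are hypothetical, and the "large X" for colocation, cross connects and servers is deliberately left out because no number is given for it.

```python
import math

PORT_GBPS = 10            # one Direct Connect port, as described above
PORT_COST_PER_YEAR = 20_000  # "about 20k per year", currency as in the comment


def ports_needed(single_port_days: float, target_hours: float) -> int:
    """Number of parallel ports needed to shrink a restore to target_hours,
    assuming the transfer parallelizes perfectly across ports."""
    return math.ceil(single_port_days * 24 / target_hours)


def yearly_port_cost(ports: int, cost_per_port: float = PORT_COST_PER_YEAR) -> float:
    """Yearly port fees only; excludes colo, cross-connect and server costs."""
    return ports * cost_per_port


ports = ports_needed(single_port_days=12, target_hours=12)
print(ports)                     # 24 ports
print(ports * PORT_GBPS)         # 240 Gb/s aggregate
print(yearly_port_cost(ports))   # 480000 per year, before the "large X"
```

Perfect parallelization is itself an optimistic assumption, so treat 24 as a lower bound.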

And that's before we even talk about availability. At this point it becomes questionable whether even Amazon has enough backhaul capacity at those locations for you to actually max out 50+ 10Gb connections simultaneously.

At this scale there is no “just.”