I had a similar problem at a past job, though we only had a PB of data. We used a product called SwiftStack. It is open source, but they have paid support. I recommend getting support, as theirs is really good. It is an object store like S3, but it has its own API, though I think they now have an S3-compatible gateway.
We had about 25 Dell R730xd servers. When the cluster would start to fill up, we would just replace drives with larger drives. Upgrading drives with SwiftStack is a piece of cake. When I left we were upgrading to 10TB drives, as that was the best pricing. We didn't buy the drives from Dell, as they were crazy expensive. We just bought drives from Amazon/Newegg and kept some spares onsite. We got a better warranty that way too: Dell only had a 1-year warranty, but the drives we were buying had a 5-year warranty.
Way late to the discussion, but I second the positive remarks on SwiftStack. It's in the easy button category in this case. The core storage engine of SwiftStack is open source (OpenStack Swift). However, the nice wrap-around tooling and web dashboard are not open source.
I’m not an AWS pricing expert, but you should be aware you’re still on the hook for S3 requests even if you can get out of paying for bandwidth. Is AWS direct connect a pure peering arrangement? I wonder what their requirements are for that. Guess I’ll read the link :)
Idk what your team’s expertise is, but I’d advise avoiding the cloud as long as possible. If you can build out an on-premise infrastructure, it will be a huge competitive advantage for your company because it will allow you to offer features that your competitors can’t.
Examples of this:
- Cloudflare built up their own network and infrastructure and it’s always been their biggest asset. They set the standard for free tier of CDN pricing, and nobody who builds a CDN on top of an existing cloud provider will ever beat it.
- Zoom. By hosting their own servers and network, Zoom is similarly able to offer a free tier where they are not subject to variable costs from free customers losing them money on bandwidth charges.
- WhatsApp. They scaled to hundreds of millions of users with fewer than a dozen engineers, a few dozen (?) servers, and some Erlang code.
IMO defaulting to the cloud is one of the worst mistakes a young company can make. If your app is not business critical, you can probably afford up to a day of downtime or even some data loss. And that is unlikely to happen anyway, as long as you’ve got a capable team looking after it who chooses standard and robust software.
I run cloud infra for a living. Have been managing infrastructure for 20 years. I would never for one second consider building my own hosting for a start-up. It would be like a grocery delivery company starting their own farm because seeds are cheap.
Depends on what you’re doing, I suppose. I think the three companies I mentioned (Cloudflare, Zoom and WhatsApp) are good examples of infrastructure investment as a competitive advantage.
None of those are start-ups, though. They've either IPOed (Cloudflare, Zoom) or been acquired by publicly-traded companies (WhatsApp).
A startup is a company that might still need to pivot to find its final business model, potentially shedding its entire existing infrastructure base in the process. Start-ups are why IaaS providers don't default to instance reservations — because, as a startup, you might suddenly realize that you won't be needing that $10k/hr of compute, but rather $10k/hr of something else.
Or suppose you run the most successful/profitable Fantasy Sports League start-up on the internet (used to work for 'em) and host your own gear. Every year you have to analyze trends in use and predict future load, so you can build up the capital needed to buy all new racks of servers every 2-3 years and pay for all the IT staff and datacenter costs.
That was before the cloud existed. They had to poach experts from hosting companies to build and maintain their gear. They built a 24/7 NOC, did server repair, became network experts, storage experts, database experts. Besides being incredibly complex and burdensome, it was financially risky. If they missed their projections they could over-invest by 1-2 million bucks, or even worse, not have the capacity needed to meet demand.
If somebody told us back then that we could pay a premium to be able to scale at any time as much as we needed, when we needed it? We would have flipped out. We had heard about Amazon building some kind of "grid computing" thing, but it seemed like a pipe dream for universities, like parallel computing. Turns out it was a different kind of grid.
WhatsApp ran on bare metal in SoftLayer prior to (and well after) being acquired by FB.
Cloudflare went well beyond leasing servers and built their own POPs, with their own network etc., prior to IPO. Much of what they built wouldn't have made economic sense with the AWS tax.
I didn't mean to imply that IPOing is the point at which a start-up becomes a not-start-up. None of these three were a start-up for quite a few years before their IPO, either.
In most of these cases, the company's growth from startup to not-startup was only possible because of its infrastructure advantage. Do you think Cloudflare the startup could have offered a free tier if they had to pay Amazon $0.10 per GB that their users sent over the network?
Of course not. But the free tier was a vital component of Cloudflare's growth, first-mover advantage and wide adoption.
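To put rough numbers on that, here is a minimal sketch of what a flat per-GB egress charge adds up to. It only assumes the $0.10 per GB figure quoted above; the traffic volumes and the function name `egress_cost_usd` are made up for illustration, and real cloud pricing is tiered, so treat this as an upper-bound style estimate rather than an actual bill.

```python
def egress_cost_usd(gigabytes: float, usd_per_gb: float = 0.10) -> float:
    """Flat per-GB egress cost; 0.10 USD/GB is the figure quoted in the thread."""
    return gigabytes * usd_per_gb


# Hypothetical monthly traffic volumes, chosen only for illustration
# (decimal units: 1 TB = 1,000 GB, 1 PB = 1,000,000 GB).
monthly_traffic_gb = {
    "1 TB": 1_000,
    "1 PB": 1_000_000,
    "100 PB": 100_000_000,
}

for label, gb in monthly_traffic_gb.items():
    print(f"{label:>6} per month -> ${egress_cost_usd(gb):,.0f}")
```

The point is just that the cost scales linearly with free users' traffic, which is exactly the variable cost a free tier can't absorb.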
> as long as you’ve got a capable team looking after it who chooses standard and robust software.
And cheap.
If you put people in charge who are looking for ways of expanding their empire and budget through spending money on EMC/VMWare/Oracle/etc/etc then you can quickly wind up spending a lot more money.
Simplistic network designs, simplistic server designs, simplistic storage designs with mostly open source software used everywhere can be highly competitive with Cloud services.
Mostly, all that Amazon did to create AWS/EC2 was fire anyone who said words like SAN or EMC, do everything very cheaply using open source software, and evolve away from enterprise vendors and toward commodity hardware.
If you make "frugality" a core competency in your datacenter design like Amazon did, then you can easily beat the cloud.
You also need to have [dev]ops people who are inclined to say "yes" to the business and who know how to debug things and can operate independently of needing to phone up EMC.
Can you explain more? Because I honestly don't know enough about SANs to know the difference.
To me, a "Storage Area Network" is 1. a cluster of disk-servers, serving the role of exposing logical block-storage over a protocol like iSCSI (whether directly to client machines, or managed and dynamically allocated by hypervisor software like vSphere), where 2. machines are connected to that storage cluster over a dedicated network interface, to keep LAN/WAN packets from contending for throughput with SAN packets.
By that definition, EBS is definitely a SAN. (And technically, so is my two-drive NAS, if I configure it as an iSCSI target and then run a second switch that connects to its second network port and my workstation's second network port.)
Does "SAN" imply some specific internal architecture for the storage cluster or something?
And, if so, then what do you call the type of thing that EBS is?
Back-of-the-envelope calculation: you need 30TB raw, so about 60 servers. They aren’t really that power hungry, so 10 per cabinet, which makes 6 cabinets and at least 6+2 switches.
Software-wise, you have lots of options with this infra. High upfront cost but low MRC vs. all other options, assuming you have skilled sysadmins who know what they are doing.
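The cabinet and switch arithmetic above can be sketched like this. The server count and the 10-per-cabinet density come from the comment; reading "6+2" as one switch per cabinet plus two extra (e.g. for aggregation) is my assumption, and `rack_plan` is a hypothetical helper name.

```python
import math


def rack_plan(servers: int, servers_per_cabinet: int, extra_switches: int = 2) -> dict:
    """Rough rack layout: cabinets needed, plus one switch per cabinet
    and a few extra switches (assumed here to be aggregation/uplink)."""
    cabinets = math.ceil(servers / servers_per_cabinet)
    switches = cabinets + extra_switches
    return {"cabinets": cabinets, "switches": switches}


# Figures from the comment: about 60 servers, 10 per cabinet.
plan = rack_plan(servers=60, servers_per_cabinet=10)
print(plan)
```

Rounding up with `math.ceil` matters once the server count stops dividing evenly: a 61st server means a 7th cabinet.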