by bradleyland 5472 days ago
I'm really happy you responded :) My critique was more of the decision making process than the product, so I'm really happy to have the feedback. I try to always assume that my thinking is flawed, so after I read your post, I tried to go back and read my own as if someone else wrote it. I know that sounds silly, but it often leads to insights I wouldn't otherwise have.

You make a good case for the complexity, which is one of the factors I consider. One of my unspoken concerns is that an open source library like Socket.io is going to get stale. I keep an eye on the commit log, and it looks like development is pretty active, but a look at the impact graph shows that there was a lull, and it looks like development might be changing hands. This is par for the course with open source projects, but I question whether "par for the course" is best for our project.

My other big concern is running "yet another server". We're currently running a very standard Rails stack on VPS infrastructure. We're not on a full-blown "cloud" provider, because I don't trust them. Heroku has had too much downtime for our company. AWS offers multiple AZ redundancy, but there's an inherent complexity there that we're not ready to tackle.

Unlike most startups, our revenue generating traffic isn't spread out over thousands of visits. Our revenue is made during a purchasing event that runs for a half hour. Some days, there might only be a single event. If our infrastructure is down for that event, we stand to lose $25k-$50k. I know we're not the only ones with this problem (some would lose millions over the same time period), but we're a bootstrapped company. A major outage like the ones Heroku has experienced could put a serious hole in our cash flow.

Because of this, we run a very straightforward Rails hosting stack (Apache/Passenger/MySQL) on a straightforward Xen VPS host at data centers in Dallas and NYC. I fully expect that at some point in the future, one of these data centers is going to disappear off the net and we'll have to be ready. If our entire hosting provider disappeared, I could bring up every one of our customer app instances in a half-day. I know this because I've done the drills using local VPS builds from scratch, and we use third-party, off-net backup locations.

Any XaaS product we integrate into our core product would have to demonstrate a similar focus on uptime and redundancy. The recent AWS outage that took several high-profile websites offline is a good example. Lots of seemingly smart companies hadn't deployed across multiple AZs because of the complexity. I can't partner with a company that makes those kinds of decisions.

I know you don't want to try to sell me Pusher, but I'm genuinely curious about Pusher's commitment to uptime. I'd love to see a blog post, or knowledge base article that talks about Pusher's redundancy and disaster recovery plans. I'd love to "just trust" that XaaS companies have this sort of thing covered, but history has taught me otherwise.

Thanks again. I really appreciate the response.

1 comment

You are smart to ask these questions. Reliability is a serious topic, and it is one that http://www.pubnub.com addresses. I work at PubNub and believe it is the best offering for reliability and uptime because it is a globally distributed service. During the EC2 outage on the East Coast USA, we lost our servers there. However! Our automated redundant routing system transferred traffic to the nearby healthy data centers. Customers didn't notice the switch, and went about their day.