by Danack 3221 days ago
We've been having 'fun' with ongoing issues for a site since 6pm UTC yesterday, which got dramatically worse this morning...and have been recurring throughout the day.

Having multi-hour outages makes me really want to go back to hiring a couple of physical servers in a rack somewhere.

4 comments

The "cloud" is a hilarious failure at the main marketing point. Decentralized just means you have no idea where your servers are.

Scaling is the only real selling point of cloud, and far more people think they need it than actually do.

I have no idea where my power or water come from, or where the cell towers are located for my phone, or the satellites they communicate with. I don't want/need to know. Most cloud providers tell you which region of which country you're in; most people don't need to know much more (and if you do, then don't use cloud).
The difference is you're not concerned about where those come from. It doesn't matter whose water it is or where the power is coming from. And they're both simple commodities with few metrics.

We're trying to fit something that's generally very centralized into the same model. Where the servers are does matter. What OS they run and how reliable they are matters on an individual level. The environment your server runs on is quite important and it's definitely your server, not "any server will do", by a long shot.

If the cloud were just some source of CPU instructions, we would have a government-regulated source of CPU power for everyone. But depending on what you're running, RAM size, cache size, network latency, CPU architecture, drive type, and endless other variables come into play, and they are all important.

Depending on hardware that you can't control to have metrics you definitely need to control is going to make the system less reliable, and that's what we're seeing now with cloud computing.

I lived in a country where everyone has a power generator in their building. Let's just say the quality of life was significantly lower. This cloud shift is like an unstoppable tidal wave. I'm always surprised when I hear people with your argument. Are you willing to imagine that in a couple years things may change your perspective?
The key is that most services don't need to be reliable. I think the cloud has huge promise here. Engineers tend to think their app needs five-nines reliability when we live in a world where the banks close twice a week.
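
To put numbers on "five-nines", here is a minimal sketch of the downtime budget each availability target allows per year. The function name and the non-leap-year constant are illustrative assumptions, not anything from the thread.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of downtime per year permitted at a given availability (0..1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    targets = [("two nines", 0.99), ("three nines", 0.999), ("five nines", 0.99999)]
    for label, availability in targets:
        minutes = downtime_minutes_per_year(availability)
        print(f"{label}: about {minutes:.1f} minutes per year")
```

Five nines leaves a budget of roughly five minutes a year, while three nines allows a little under nine hours, which is the gap the comment is pointing at.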

I don't think on-prem will ever die out. It's like owning vs renting your office. There are pros and cons to each and we'll eventually hit some kind of equilibrium.

My cloud provider tells me which city my server is in and I get to pick the OS. I don't care about the topology of their data-center or what rack I'm in, or even the precise location of the data-center beyond which region it is in. Your needs may differ; if so, don't use cloud.
Your power and water, typically delivered via non-diverse paths, are way more reliable than GCP if we measure by the recent outages.
I think it all boils down to how you deploy your stuff. If you think Cloud is so massive that it is never a SPOF, you'll likely fail to meet your availability aspirations at some point in the future.

Cloud to me is also shared risk. I read "Google cloud outage" as "multiple companies that rely on a shared infrastructure are not available ATM".

The mindset should be to run your services on a distributed infrastructure with no SPOF. Leverage cloud, fog, racks, PCs, whatever resources you can, but diversify your content/service and guard against any single kind of failure.
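
A rough way to see why diversifying helps: if a service stays up as long as at least one provider is up, and provider failures are independent, the combined availability is one minus the product of the individual unavailabilities. The sketch below assumes exactly that; independence is a strong assumption, and any shared dependency (one DNS provider, one load balancer) breaks it. The function name is hypothetical.

```python
from math import prod
from typing import Sequence


def combined_availability(availabilities: Sequence[float]) -> float:
    """Availability when the service is up if ANY provider is up.

    Assumes provider failures are statistically independent, which is
    optimistic whenever the providers share any dependency.
    """
    # Probability that every provider is down at the same time.
    all_down = prod(1.0 - a for a in availabilities)
    return 1.0 - all_down


if __name__ == "__main__":
    single = combined_availability([0.99])
    dual = combined_availability([0.99, 0.99])
    print(f"one provider at 99%:  {single:.4%}")
    print(f"two providers at 99%: {dual:.4%}")
```

Under these assumptions two 99% providers combine to about 99.99%, but only if nothing between them is shared.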

What about experimenting and failing fast and cheap? Will you buy servers, rack space, and a service contract for a year to develop your app when you can experiment with servers in the cloud for cents? Public clouds are about muuuuch more than scaling; we can also talk about managed services, agility, PAYG, etc.
Up to a certain scale I think being completely off the cloud is never out of the question.

But if you're a small-ish team with ambitions of building something that may one day need to scale quickly to support a large influx of traffic / customers (generally unannounced / unplanned), I think it's insane not to have a cloud strategy / presence.

I have never seen research on it, but my hunch is that given that a large number of websites / services end up being impacted simultaneously, it's probably better to be down when everyone else is, than being the only one down.

In my experience when there's a large-scale outage, that information is far more likely to get back to the end user a lot faster, and in a fashion where it may not even impact their perception of your business (ie. maybe they originally experienced the issue on someone else's website / app).

But you can be certain that while you're down when everyone else is up, your potential and existing customers / users are far more likely to blame you and begin searching for alternatives.

but the cloud should have been far more redundant and easier to maintain!
All praise the cloud! Blessed be the cloud!
ಠ_ಠ
And when your load balancers fail you can be the one responsible for fixing it instead of Google. The internet is brittle. You should use servers at different clouds / data-centers and use DNS failover for specifically this reason. Maybe even use 2 different cloud-based DNS providers.
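
The decision a DNS failover service makes can be sketched as "probe each endpoint in priority order and serve the first healthy one". This is not a DNS implementation, only the health-check logic behind it; the hostnames, the port, and the function names are hypothetical, and a real setup would update DNS records through the provider's own tooling.

```python
import socket
from typing import Callable, Optional, Sequence


def tcp_healthy(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Treat an endpoint as healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def pick_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool] = tcp_healthy,
) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None."""
    for host in endpoints:
        if is_healthy(host):
            return host
    return None


if __name__ == "__main__":
    # Hypothetical hostnames, one per provider / data-center.
    candidates = ["app.cloud-a.example", "app.cloud-b.example"]
    print(pick_endpoint(candidates))
```

Passing the health check in as a parameter keeps the selection logic testable without touching the network.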
The issue is not that the compute instances are unreliable - the issue is that the super-awesome-magic-dust is not reliable and "cloud" is not the way to use compute, rather it is the way to use the magic dust to get unicorns.
What does that even mean?
Ask people who use or advocate using Google GLB to build a small equivalent of it using regular instances.