Multiple Digital Ocean services down | HN Mirror

	Multiple Digital Ocean services down (status.digitalocean.com)
	115 points by inanothertime 253 days ago

10 comments

showerst 253 days ago

I use DO's load balancers in a couple of projects, and they don't list Cloudflare as an upstream dependency anywhere that I've seen. It's so frustrating to think you're clear of a service then find out that you're actually in their blast radius too through no fault of your own.

miken123 253 days ago

It is mentioned in their list of subprocessors: https://www.digitalocean.com/trust/subprocessors

coreylane 253 days ago

I find stuff like this all the time. railway.com recently launched an object storage service, but it's simply a wrapper for Wasabi buckets under the hood, and they don't mention this anywhere, not even on the subprocessors page (https://railway.com/legal/subprocessors). Customers have no idea they are using Wasabi storage buckets unless they dig around the DNS records, so I have to do all this research to find upstream dependencies, go subscribe to status.wasabi.com alerts, etc.

dig b1.eu-central-1.storage.railway.app +short

s3.eu-central-1.wasabisys.com.

eu-central-1.wasabisys.com.
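
For anyone who wants to automate that kind of digging, here is a minimal sketch in Python. It does not do any DNS lookups itself: it assumes you already have the CNAME chain (for example, copied from `dig +short`) and only matches the names against a suffix map. The `KNOWN_UPSTREAMS` map and the `identify_upstreams` function are hypothetical names made up for illustration, and the map is far from complete.

```python
# Hypothetical, incomplete suffix-to-provider map (an assumption for this sketch).
KNOWN_UPSTREAMS = {
    "wasabisys.com": "Wasabi",
    "cloudflare.net": "Cloudflare",
    "amazonaws.com": "AWS",
}


def identify_upstreams(cname_chain, known=KNOWN_UPSTREAMS):
    """Return the sorted list of providers whose domain suffix appears in the chain."""
    found = set()
    for name in cname_chain:
        # dig prints fully qualified names with a trailing dot; strip it first.
        host = name.rstrip(".").lower()
        for suffix, provider in known.items():
            # Match the bare domain or any subdomain of it, but not lookalikes
            # such as "notwasabisys.com".
            if host == suffix or host.endswith("." + suffix):
                found.add(provider)
    return sorted(found)


# The two names from the dig output quoted in this comment.
chain = ["s3.eu-central-1.wasabisys.com.", "eu-central-1.wasabisys.com."]
print(identify_upstreams(chain))
```

A match only tells you where to look next; a provider that fronts its upstream with its own hostnames or IPs will not show up this way.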

timomeh 252 days ago

Hey, I'm the person that was responsible for adding object storage to Railway. It was my onboarding project, basically a project I was able to choose myself and implemented in 3 weeks in my 3rd month after joining Railway.

Object Storage is currently in Priority Boarding, our beta program. We can and will definitely do better, document it and add it to the subprocessor list. I'm really sorry about the current lack of it. There was another important project that I had to do between the beta release of buckets and now. I'm on call this week, but will continue to bring Buckets to GA next week. So, just to give this context: there's no intentional malevolence or shadiness going on. It's simply because there's 1 engineer (me) working on it, and there's a lot of stuff to prioritize and do.

It's also super important to get user feedback as early as possible. That's why it's a beta release right now, and the beta release is a bit "rushed". The earlier I can get user feedback, the better the GA version will be.

On the "simply a wrapper for wasabi buckets" - yes, we're currently using wasabi under the hood. I can't add physical Object Storage within 3 weeks to all our server locations :D But that's something we'll work towards. I wouldn't say it's "simply" a wrapper, because we're adding substantial value when you use Buckets on Railway: automatic bucket creation for new environments, variable references, credentials as automatic variables, included in your usage limits and alerts, and so on.

I'll do right by you, and by all users.

wesammikhail 253 days ago

Slightly off topic: I used DO LBs for a little while but found myself moving away from them toward a small droplet with an haproxy or nginx setup. Worked much better for me personally!

showerst 253 days ago

The point of an LB for these projects is to get away from a single point of failure, and I find configuring HA and setting up the networking and everything to be a pain point.

These are all low-traffic projects so it's more cost effective to just throw on the smallest LB than spend the time setting it up myself.

grayhatter 253 days ago

If they are small projects, why are they behind a load balancer to begin with?

nickmonad 253 days ago

Usually because of SSL termination. It's generally "easier" to just let DO manage getting the cert installed. Of course, there are tradeoffs.

showerst 253 days ago

I use the LBs for high availability rather than for load balancing as such. The LB + 2 web back-ends + Managed DB means a project is resilient to a single server failing, for relatively low devops effort and around $75/mo.
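
To make the "resilient to a single server failing" part concrete, here is a toy sketch in Python of round-robin selection that skips back-ends marked unhealthy. This is not how DO's load balancers are implemented; the `Balancer` class and the back-end names `web-1` and `web-2` are hypothetical and exist only for illustration.

```python
import itertools


class Balancer:
    """Toy round-robin picker that skips back-ends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        # A real LB would call this after failed health checks.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each back-end at most once per call, in round-robin order.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy back-ends")


lb = Balancer(["web-1", "web-2"])
lb.mark_down("web-1")  # simulate one server failing
print(lb.pick())       # traffic keeps flowing to the remaining back-end
```

The sketch also shows the limit of the setup: it survives either back-end failing, but the balancer itself remains a single dependency.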

grayhatter 253 days ago

Are both servers deployed from the exact same repo/scripts? Or are they meaningfully different, and/or balanced across multiple data centers?

Did your high availability system survive this outage?

bbss 253 days ago

Regional LBs do not have Cloudflare as an upstream dependency.

jsheard 253 days ago

They don't name names but it's probably due to the ongoing Cloudflare explosion. I know the DigitalOcean Spaces CDN is just Cloudflare under the hood.

matt-p 253 days ago

Just the Spaces CDN, not Spaces itself. You'd think they'd just turn the CDN off for a bit.

potato3732842 253 days ago

You can't just "turn off CDN" on the modern internet. You'd instantly DDOS your customers' origins. They're not provisioned to handle it, and even if they were the size of the pipe going to them isn't. The modern internet is built around the expectation that everything is distributed via CDN. Some more "traditional" websites would probably be fine.

oasisbob 253 days ago

Might be just me, but I can think of many origins under my control which could live without a (non-functional) CDN for a while.

A CDN is great for peak load, latency reduction, and cost, but not all sites depend on it for scale 24/7.

matt-p 253 days ago

If you are DO you could; you just decided not to bother. They control the origins, since it's Spaces (S3), so they could absolutely spin up further gateways or a cache layer and then turn the CDN off.

graemep 253 days ago

Either you are wrong and they do not have the capacity to do that, or they have decided it is acceptable to be down because a major provider is down.

I imagine a cache layer cannot be that easy to spin up - otherwise why would they outsource it?

matt-p 253 days ago

You outsource it because Cloudflare has more locations than you, so it offers lower latency, and can offer it at a cost that's cheaper than or the same as doing it yourself.

tgma 253 days ago

nit: that's more DoS (from a handful of DO LBs) than DDoS.

TechRemarker 253 days ago

Yes, all sites are showing the Cloudflare error due to the massive outage. Seems their outages are getting more frequent and taking down the internet in new ways each time.

foxyv 253 days ago

Man, it really seems like the cloud providers are having some tough times lately. Azure, AWS, and Cloudflare! Is everything just secretly AWS?

archerx 253 days ago

I have two projects on DO using droplets and they are still running fine.

soheilpro 253 days ago

Droplets are fine.

> This incident affects: API, App Platform (Global), Load Balancers (Global), and Spaces (Global).

hshdhdhj4444 253 days ago

It seems to be mostly a Cloudflare-related issue.

My DOs are working fine as well.

sgc 253 days ago

Are you using their "reserved IPs"? I was thinking of starting to use them, but now I wonder if it is part of their load balancing stack under the hood.

giancarlostoro 253 days ago

So yesterday Azure got hit hard, today CF and DO are down, bad week or something else?

watermelon0 253 days ago

Azure DDoS event happened in October. Blog post about the attack was published yesterday, and was quickly picked up by news sites.

matt-p 253 days ago

DDOS, but I don't really understand why in particular.

giancarlostoro 253 days ago

Having known people like this, it's either flexing about who has the more powerful botnet or advertising who can do what.

red-iron-pine 252 days ago

NATO testing internal infra, or Russian hackers stepping it up after aggressive sabotage efforts in Eastern Europe?

BubbleRings 253 days ago

I would also like to know people’s opinion on this.

zx8080 253 days ago

Year-end promotion cycle is the worst time for end-users and the best one for engineers greedy for promotions.

Lammy 253 days ago

Don't blame individual engineers who want to do what will be rewarded instead of company performance policies that reward this type of behavior.

fridder 253 days ago

shoot, there are also end of year layoffs and reorgs to pump up those year end numbers

red-iron-pine 252 days ago

what engineers, mate? they AI now

and they're doing just spectacular

igtztorrero 253 days ago

I knew it, DigitalOcean CDN is using Cloudflare behind the scenes. Why, DO?

aforty 253 days ago

Cloudflare outage.

mrkramer 253 days ago

Who is next?

red-iron-pine 252 days ago

my guess would be to look at who has a FedRAMP-capable service first.

maybe also GCP, hetzner, akamai

drob518 253 days ago

Dominos falling into dominos falling into dominos…