|
I'm Head of DevOps at a major financial exchange where latency and resiliency are at the heart of our business, and yes, we pay Cloudflare millions. I see two things here:

# Just be ready

This is most definitely not the first time Cloudflare has had trouble. Like any other system, it will fail eventually. If you're complaining about the outage, ask yourself: why weren't you prepared for this eventuality? Spread your name servers, and use short-TTL weighted CNAMEs, defaulting to, say, 99% Cloudflare and 1% your internal load balancer. The minute Cloudflare seems problematic, flip the weights to 0% Cloudflare and 100% internal load balancer to bypass Cloudflare's infrastructure completely. Test this periodically to ensure that your backends can scale and take the load without shedding once the CDN is out of the path.

# Management practices

Cloudflare's core business is networking. It actually embarrasses me to see that Cloudflare YOLO'd a BGP change in a Juniper terminal without peer review and/or a proper administration dashboard exposing safeguarded operations, a simulation engine, and the like. In particular, re-routing traffic and bypassing POPs must be a frequent task at that scale, so how can that not be automated to avoid human mistakes?

If you look at the power rails of serious data centers, you will quickly notice that those systems, although built 3x so that they remain redundant during maintenance periods, are heavily safeguarded and automated. While technicians often have to replace power elements, maintenance access is highly restricted, with unsafe functions gated behind physical restrictions. A common safeguard is the automatic denial of an input command that would shift electrical load onto lines beyond their designed capacity. That could happen by mistake if the technician made a bad assumption (e.g. the load-sharing line is up while it's actually down) or if the assumption was invalidated since the last check (e.g. the load-sharing line was up when checked but went down later, even milliseconds before the input). |
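To make the weighted-CNAME failover from the comment above concrete, here is a minimal Python sketch of just the decision logic: which weights to publish given recent health probes of the CDN. Everything named here is an assumption for illustration: the hostnames `site.cdn.example.net` and `lb.internal.example.com`, the `WeightedTarget` type, and the 90% probe-success threshold are hypothetical, and actually publishing the records depends on your DNS provider's API, which is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedTarget:
    """One weighted CNAME target (hypothetical type, for illustration only)."""
    cname: str
    weight: int  # relative share of DNS answers; 0 disables the target


def is_cdn_healthy(recent_probes: list[bool], min_success_ratio: float = 0.9) -> bool:
    """Treat the CDN as healthy when enough recent probes succeeded.

    The 0.9 threshold is an assumed value, not something from the comment.
    """
    if not recent_probes:
        # No data yet: keep the default split rather than fail over blindly.
        return True
    return sum(recent_probes) / len(recent_probes) >= min_success_ratio


def plan_weights(
    cdn_healthy: bool,
    cdn: str = "site.cdn.example.net",        # hypothetical CDN hostname
    origin: str = "lb.internal.example.com",  # hypothetical internal load balancer
) -> list[WeightedTarget]:
    """Return the weighted record set to publish for the current CDN health."""
    if cdn_healthy:
        # Normal operation: 99% via the CDN, 1% direct so the origin path stays exercised.
        return [WeightedTarget(cdn, 99), WeightedTarget(origin, 1)]
    # CDN looks problematic: send everything to the internal load balancer.
    return [WeightedTarget(cdn, 0), WeightedTarget(origin, 100)]
```

How quickly a flip takes effect is bounded by the record TTL, which is why the comment insists on short TTLs. Keeping 1% of traffic on the origin path during normal operation also means the bypass route is continuously exercised rather than only tested during an incident.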
Which can't be done, because it invalidates the point of using CloudFlare!
CloudFlare is used to protect your site from DDoS attacks and ransom demands. It has to hide the IPs of your servers; otherwise attackers will DDoS the servers directly, bypassing CloudFlare.