by whynotminot 703 days ago
Canary deployments are already an industry-accepted practice, and it’s shocking CrowdStrike apparently doesn’t do them.

Which industry? Cybersecurity or Cloud software?
Any industry that wants to reliably deliver software that doesn’t brick systems at scale? I’m confused by your question.

Are you telling me the cybersecurity scene is special and shouldn’t follow best practices for software deployment?

A canary deployment to a subset of Salesforce customers won't provoke much of a revolt compared to an AV definition rollout (not software, but AV definitions) in cybersecurity, where any gap between a 0-day and the rollout means you're exposed.

If customers find out that some are getting the rollout faster than others, essentially splitting them into two groups, there will be a need for customer opt-in/opt-out.

If everyone opts out because of Friday, your canary deployment becomes meaningless.

Any proof that other cybersecurity vendors do canary deployments for their AV definitions? :)

PS: that's not to say the company shouldn't test more internally...

Canary deployment doesn’t necessarily mean massive gaps between deployment waves. You can fast-follow. Sure, there may be scenarios with especially severe vulnerabilities where time is of the essence. I’m out of the loop on whether this CrowdStrike update was such a scenario, where best practices for software deployment were worth bypassing.

If this is just how they roll with regular definition updates, then their deployment practices are garbage and this kind of large-scale disaster was inevitable.
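
To make the fast-follow point concrete, here is a minimal Python sketch of a staged rollout. Everything in it is an assumption for illustration: the `Wave` schedule, the 2% failure threshold, and the `deploy` and `failure_rate` callbacks are hypothetical names, not anything CrowdStrike or another vendor is known to use. It only shows the shape of the idea: push to a small slice first, check health, then widen quickly.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Wave:
    fraction: float    # cumulative share of the fleet covered once this wave is done
    bake_minutes: int  # how long to watch health signals before the next wave


# Hypothetical schedule: a small canary, then fast-follow waves.
WAVES = [Wave(0.01, 15), Wave(0.10, 15), Wave(0.50, 15), Wave(1.00, 0)]


def staged_rollout(
    hosts: Sequence[str],
    waves: Sequence[Wave],
    deploy: Callable[[str], None],
    failure_rate: Callable[[Sequence[str]], float],
    max_failure_rate: float = 0.02,  # assumed threshold, tune per fleet
) -> Tuple[List[str], bool]:
    """Deploy wave by wave and return (updated_hosts, halted)."""
    updated: List[str] = []
    for wave in waves:
        target = round(len(hosts) * wave.fraction)
        batch = hosts[len(updated):target]
        for host in batch:
            deploy(host)
        updated.extend(batch)
        # A real system would wait wave.bake_minutes here before checking.
        if updated and failure_rate(updated) > max_failure_rate:
            return updated, True  # stop: later waves never receive the update
    return updated, False


def total_bake_minutes(waves: Sequence[Wave]) -> int:
    """Delay that staging adds between the first and the last wave."""
    return sum(wave.bake_minutes for wave in waves)
```

With this hypothetical schedule the bake times add up to 45 minutes for the whole fleet, and an update that trips the health check in the first wave stops at roughly 1% of hosts.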

Let's walk through this: a canary deployment to Windows machines. If those Windows machines get hit with a BSOD, they go offline. How do you determine whether they went offline because of the canary or because of regular maintenance in the customer's IT cycle?

You can guess, but you cannot be 100% sure.

What if the machines targeted by the canary are employee desktops that are OFFLINE at the time of the rollout?

> I’m out of the loop on whether this CrowdStrike update was such a scenario, where best practices for software deployment were worth bypassing.

I did post a question: what about other Cybersecurity vendors? Do you think they do canary deployment on their AV definitions?

Here's more context to understand Cybersecurity: https://radixweb.com/blog/what-is-mean-time-to-detect

Cybersecurity companies participate in annual security evaluations that measure and grade their performance. That grade is an input organizations use to select vendors, on top of their own metrics/measurements.

I don't know if MTTD (mean time to detect) is included in the contract/SLA. If it is, you have some answer as to why certain decisions are made.

It's definitely interesting to see the software developers of HN giving their 2c on the niche cybersecurity industry.

> You can guess, but you cannot be 100% sure.

I worked in the cyber security space for a decent chunk of my career, and the most frustrating part was cyber security engineers thinking their problems were unique and being completely unaware of the lessons software engineering teams have already learned.

Yes, you need to tune your canary deployment groups to be large and diverse enough to give a reliable indicator of deployment failure, while still keeping them small enough that they achieve their purpose of limiting blast radius.

Again, if you follow industry best practices for software deployment, this is already something that should be considered. This is a relatively solved problem -- this is not new.
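
One common way to handle the "was it the update or just maintenance?" noise is to compare the canary cohort against a control cohort that did not get the update, over the same window. A rough Python sketch, assuming you can count the hosts that stopped checking in for each cohort; the function names and the z-threshold of 3 are my own placeholders, and the statistic is a plain two-proportion z-test rather than anything a specific vendor is known to run.

```python
import math


def offline_excess_z(
    canary_offline: int, canary_total: int,
    control_offline: int, control_total: int,
) -> float:
    """Two-proportion z-score: how many standard errors the canary cohort's
    offline rate sits above the control cohort's offline rate."""
    if canary_total <= 0 or control_total <= 0:
        raise ValueError("both cohorts need at least one host")
    p_canary = canary_offline / canary_total
    p_control = control_offline / control_total
    pooled = (canary_offline + control_offline) / (canary_total + control_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / canary_total + 1 / control_total))
    if se == 0:
        return 0.0  # both cohorts fully online or fully offline: no difference
    return (p_canary - p_control) / se


def should_halt(
    canary_offline: int, canary_total: int,
    control_offline: int, control_total: int,
    z_threshold: float = 3.0,  # assumed cutoff, not an industry constant
) -> bool:
    """Halt when the canary cohort is offline far more often than the control."""
    z = offline_excess_z(canary_offline, canary_total, control_offline, control_total)
    return z > z_threshold
```

Routine maintenance and powered-off desktops hit both cohorts about equally, so they mostly cancel out; only the excess in the canary cohort counts against the update. Larger cohorts shrink the standard error, which is why group size matters.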

> I did post a question: what about other Cybersecurity vendors? Do you think they do canary deployment on their AV definitions?

I think that question is being asked right now by every company using CrowdStrike: which vendors are actually doing proper release engineering, and how fast can we switch to them so that this never happens to us again?