Hacker News
by nilsb 702 days ago
Who needs testing when apologizing to your customers is cheaper?
6 comments

Reputational damage from this is going to be catastrophic. Even if that’s the limit of their liability it’s hard not to see customers leaving en masse.
Ironically some /r/wallstreetbets poster put out an ill-informed “due diligence” post 11 hours ago concerning CrowdStrike being not worth $83 billion and placing puts on the stock.

Everybody took the piss out of them for the post. Now they are quite likely to become very rich.

https://www.reddit.com/r/wallstreetbets/s/jJ6xHewXXp

That user is the equivalent of using a screwdriver to look for gold and succeeding.
Not sure what material in their post is ill-informed. Looks like what happened today is exactly what that poster warned of in one of their bullet points.
Yeah, everyone is dunking on OP here. But they essentially said that CrowdStrike's customers were all vulnerable to something like this. And we saw a similar thing play out only a few years ago with SolarWinds. It's not surprising that this happened. Of course, when it comes to making money, timing is the crucial part, and that is hard to predict.
A convenient alibi?
The company will perish, there is no doubt about that.
Nah they'll be fine. It happened 7 months ago on a smaller scale, people forgot about that pretty quickly.

You don't ditch the product over something like this as the alternative is mass hacking.

Is the alternative "mass hacking"? I thought all this software did was check a box on some compliance list. And slow down everyone's work laptop by unnecessarily scanning the same files over and over again.
I assume you're not in the security industry?

This sounds like someone who said "Dropbox ain't hard to implement".

As someone said earlier in these comments the software is required if you want to operate with government entities. So until that requirement changes it is not going anywhere and continues to print money for the company.
But then, if what you say is true and their software is indeed mandatory in some context, they also have no incentive or motivation to care about the quality of their product, about it bringing actual value or even about it being reliable.

They may just misuse this unique position in the market and squeeze as much profit from it as possible.

The mere fact that there exists such a position in the market is, in my opinion, a problem because it creates an entity which has a guaranteed revenue stream while having no incentive to actually deliver material results.

If the government agencies insist on using this particular product then you're right. If it's a choice between many such products then there should be some competition between them.
Surely there is more than one anti-virus that can check the audit box?
From experiencing different AV products at various jobs, they all use kernel level code to do their thing, so any one of them can have this situation happen.
Extremely unlikely. This isn't the first blowup Crowdstrike has had; though it's the worst (IIRC), Crowdstrike is "too big to fail" with tons of enterprise customers who have insane switching costs, even after this nonsense.

Unfortunately for all of us, Crowdstrike will be around for a while.

Businesses would be crazy to continue with Crowdstrike after this. It's going to cause billions in losses to a huge number of companies. If I was a risk assessment officer at a large company I'd be speed dialling every alternative right now.
The cybersecurity industry has regular annual security tests and competitions, run by various organizations, that simulate tons of attacks.

Vendors are tested against these cases and graded on their effectiveness.

I heard Crowdstrike is "best-in-market" for good reasons, as others with deeper knowledge of the industry have shared in this thread.

> I heard Crowdstrike is "best-in-market"

A friend of mine who used to work for Crowdstrike tells me they're a hot mess internally and it's amazing they haven't had worse problems than this already.

It would be crazy not to at least investigate migration paths away from Crowdstrike, or better redundancies for yourself.
While it probably should, I regret to inform you that SolarWinds is still alive and well.
I mean, Boeing is still around...
I would assume that its enterprise customers have an uptime SLA as part of their contract, and that breaching it isn't very cheap for Crowdstrike.
I highly doubt their SLA says something about compensating for damages. At most you won't have to pay for the time they were down.

And even more ironically, a botched update doesn't mean they are down. It means you are down. So I don't even think their SLA applies to this.

Yeah, they'll pay with "credits" for the downtime, if what is currently happening even technically qualifies as downtime.
Software doesn't have uptime guarantees. They might have time-to-fix on critical issues, though.

I assume this is gross negligence, which would leave them open to claims made through courts, though.

As of 4am NY time CRWD has lost $10Bn (~13%) in market cap. Of course they've tested, but just not enough for this issue (as is often the case).

This is probably several seemingly inconsequential issues coming together.

I'm not sure why, when the system is this important, even successfully tested updates aren't rolled out piecemeal (or perhaps they were, and we're only seeing the result of partial failures around the world).

Testing is never enough. In fact, it won't catch 99% of issues, because tests often cover only the happy paths or only what humans can think of, and they are by no means exhaustive.

A robust canarying mechanism is the only way you can limit the blast radius.

Set up A/B testing infra at the binary level so you can ship updates selectively and compare their metrics.

Been doing this for more than 10 years now, it's the ONLY way.

Testing is not.
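
For anyone wondering what a canary rollout like the one described above might look like, here is a minimal sketch in Python. It is not how Crowdstrike or any real vendor ships updates. The names `in_cohort`, `staged_rollout`, and the `install` callback are hypothetical, and the stage percentages and crash threshold are illustrative assumptions.

```python
import hashlib


def in_cohort(device_id: str, update_id: str, percent: float) -> bool:
    """Deterministically decide whether a device falls in the first `percent` of the fleet."""
    digest = hashlib.sha256(f"{update_id}:{device_id}".encode()).digest()
    # Map the hash to a bucket in 0..9999 (basis points of the fleet).
    bucket = int.from_bytes(digest[:4], "big") % 10_000
    return bucket < percent * 100


def staged_rollout(devices, update_id, stages, max_crash_rate, install):
    """Ship to growing cohorts and halt once a stage crashes too often.

    `install(device_id)` is a hypothetical callback that returns True
    if the device came back healthy after the update.
    Returns (completed, exposed), where `exposed` is the set of devices
    that received the update before the rollout finished or was halted.
    """
    exposed = set()
    for percent in stages:
        # Cohorts nest: a device in the 1% stage is also in the 10% stage,
        # so only devices not yet exposed are new at each stage.
        cohort = [d for d in devices
                  if d not in exposed and in_cohort(d, update_id, percent)]
        if not cohort:
            continue
        crashes = sum(0 if install(d) else 1 for d in cohort)
        exposed.update(cohort)
        if crashes / len(cohort) > max_crash_rate:
            return False, exposed  # halt: blast radius is limited to `exposed`
    return True, exposed


fleet = [f"host-{i}" for i in range(10_000)]
# A bad update that crashes every device it touches stops at the first stage.
ok, hit = staged_rollout(fleet, "update-a", (1, 10, 50, 100), 0.02, lambda d: False)
print(ok, len(hit))
```

With a bad update, only the first stage (roughly 1% of the fleet) is exposed before the rollout halts. A real system would also wait for health telemetry between stages rather than judging an install synchronously.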

Depends on what you mean by enough. It should be more than enough to catch issues like this one specifically.

If they can't even manage that they'll fail at your approach as well.

Canary offers more bang for the buck, and is much easier to set up. So I kind of disagree.
> Canary offers more bang for the buck

I'm not sure that justifies potentially bricking the devices of hundreds(?) of your clients by shipping untested updates to them. Of course it depends... and would require deeper financial analysis.

They won't be able to test exhaustively every failure mode that could lead to such issues.

That's why canaries are easier and more "economical" to implement and give better value per unit effort.

Exactly. They knocked half the world offline, probably killed thousands in ERs, and the stock is only down to about June lows.
And when it’s more costly for customers to walk back the mistake of adopting your service.

Yeah, I get the impression a lot of SaaS companies operate on this model these days. We just signed with a relatively unknown CI platform, because they were available for support during our evaluation. I wonder how available they’ll be when we have a contract in place…

Hah, that tweet was one heck of an apology: "we deployed a fix to the issue, speak with your customer rep"
Unfortunately cybersecurity still revolves around obscurity.