by bwang29 3096 days ago
Quote from the article: "We need to move faster. Validation at Intel is taking much longer than it does for our competition. We need to do whatever we can to reduce those times… we can’t live forever in the shadow of the early 90’s FDIV bug, we need to move on. Our competition is moving much faster than we are."

Competition pressure could make a company's new product worse than (in this case, less stable than) their previous products, e.g. the Samsung phone explosions. As I remember the story, Samsung wanted to release their phone ahead of the iPhone, and I would imagine their testing went through a similarly stressful time as Intel's.

Of course, not every such risk leads to disaster: imagine Intel rushes to release new chips ahead of the competition and 99 times out of 100 they end up performing well. But what is distinctive about Intel's case is that these bugs, unlike a faulty battery design, are cumulative and carry over into future product development, which means a few small wins in catching up with your competitor could also lead to a massive failure in some later major battle.

Now imagine Intel's competitors are going through the exact same scenario. One possible outcome is that both Intel's and its competitors' products become less stable and more buggy over time, and until everyone's stuff visibly seems broken, they probably never find the time to fix it.

4 comments

> "we can’t live forever in the shadow of the early 90’s FDIV bug"

There is a valid point there though: if you are testing for testing's sake and not finding anything extra through the extra effort, then you are wasting time and, potentially worse, lulling yourself into a false sense of security. Testing should be done for utility, not just in response to fear. You need to test intelligently, not just test lots. Like TDD in software, good testing processes make life much easier and quality much higher, while bad testing processes can be worse than useless.

Processor bugs are always a thing and always have been a thing. Look at the list of bugs the Linux kernel scans for and works around, many of which pre-date the FDIV debacle.

What made FDIV special isn't that it was a bad bug, it was the recent change in marketing. Before then, processors were sold to manufacturers who might tell the customer what was used; unless you were a hobbyist, you didn't much care about the specifics. But the Pentium line was the first time a processor had been particularly marketed directly at the end user. It had started with the 486 line a couple of years earlier, when "Intel Inside" was first a thing, but there was a huge push in that direction with the release of the first Pentiums. Suddenly Joe Public was more aware of that detail, but was blissfully unaware that CPUs are complex beasts and generally not 100% perfect.

It didn't help that the bug was very easy to demonstrate in common applications like Excel, so Joseph & Josephine Public could see and understand the problem in a way they couldn't with, for example, the F00F bug. It was also easy to joke about ("We are Pentium of Borg. Division is futile. You will be approximated."), which fanned the rapid spread of the news. The fact that the bug only significantly affected fairly rare combinations was lost in the mass discussion about how such a bug could happen at all.

Testing is not for finding bugs, testing is for preventing bugs. But otherwise you are right: it's hard to calibrate and further develop the testing procedure if you rarely find bugs. While you might be wide awake in one eye, you might be blind in seven others. It's usually the things you don't expect that kill you. So some people need to be paranoid, be very paranoid.

The current chip bug does look like a doozy...

See https://www.theregister.co.uk/2018/01/02/intel_cpu_design_fl... if you've not already picked up on the news.

In other words, sometimes competition is a race to the bottom. But then a bug like this tends to have a "reset" effect on everybody in the market.

I look at a statement like "Our competition is moving much faster than we are" as craven and lacking vision. At that point a wizened old Zen master type figure should've stepped forward.

Competition isn't about imitating the competitor anyway, is it? It's about differentiation, right? Maybe not. But it's not like you can't easily market literally any reasonable decision you make. Paul Masson wineries bragged about selling "no wine before its time" and turned their lack of "velocity" into marketing cachet. (Even though they weren't even unique in that regard.) There's no theoretical reason why Intel couldn't market itself as "the accurate chipmaker," keep on validating "lavishly"(1) and let AMD rush headlong into this kind of bug.

(1) Obviously not... but unfortunately you never know it's not enough validation until it's not enough validation.

The FDIV bug was the original Pentium floating point bug, right? Could this be the biggest blow to Intel since that one?

> Competition pressure could make a company's new product worse than (in this case, less stable than) their previous products

Well, that depends on the specific attributes on which there is competitive pressure. When it's on time to market, yes, quality will suffer. When it's on quality, products will be slower, more expensive, etc. Kind of similar to the often repeated quality triangle in Software dev.

> Kind of similar to the often repeated quality triangle in Software dev

Which was disproved in practice.

Can you back this up? I googled and didn’t find anything relevant.

Yes I can.

https://www.amazon.fr/Economics-Software-Quality-Capers-Jone...

Not all good content is offered for free on the Internet.

Your terse and borderline patronizing answers aren’t helping to advance the conversation, though you do link to an important book.

Capers Jones’ work does not “disprove” the triangle, and doesn’t even mention it. If anything, he implies that quality is a more complicated subject than the simple triangle implies, in that lower quality will have non-linear negative impacts on the project cost, time, and scope.

This is the triangle: https://en.m.wikipedia.org/wiki/Project_management_triangle

The original triangle assumes that as a PM, if you don’t recognize the trade-off among scope, time, and cost, quality will suffer. This is true even with Jones’ data, but it is trite. The practices that help to ensure quality are outside the triangle’s intent as a guideline: it is necessary, but not sufficient, to recognize these trade-offs in order to ensure on-time, on-budget delivery with sufficient quality and scope. One could manage trade-offs among the time/cost/scope variables and still screw up their product’s quality due to poor methods and practices. Which is obvious, when you think about it.

Jones’ book also has a number of flaws: it’s hard to put his recommendations into practice (it’s more of a survey than a “how to”), and he lacks data on a number of effective, newer practices that involve quality, improved velocity, and requirements gathering or product/market fit. The result is that he tends towards promoting older practices that help quality but don’t have much impact on whether you’re building the right thing in the first place. To be fair, he admits this throughout the book but, being a data guy, shrugs and moves on to discuss what works with the data he has. A classic case of “looking for your car keys under the street lamp”. Nevertheless, it illustrates why quality pays for itself, which makes it important.

But there is no one single such triangle.

Many people have in mind this version of the "quality triangle", which is the one we were discussing (not the Project Management triangle, which I agree is more on point).

You get that a lot if you search for "quality triangle": https://www.google.fr/search?tbm=isch&q=quality+triangle

Why is it commonly assumed that quality and cost and time are opposed to each other?

The Capers Jones books have one idea throughout that is backed by data: for software, quality is positively correlated with reduced time-to-market, low defects, and reduced costs.

> The practices that help to ensure quality that are outside the triangle’s intent as a guideline

There is a chapter about that in EoSQ, where each practice, like TDD and code reviews, is measured and classified by efficiency. It's very interesting.

> The result is that he tends towards promoting older practices that help quality but don’t have much impact on whether you’re building the right thing in the first place.

Yes. I find he puts way too much faith in "off the shelf" software as the last bullet we may have.

tl;dr Capers Jones disproves the "quality triangle" big time. It is a product of intuitive reasoning without any basis, and it doesn't apply to software. Just look at Intel at this very moment.