by cybrjoe 2175 days ago
Serious question, how is this different from what VW did? There was a lot of talk about complicit engineers vs. shady management in that discussion. Did this ever make it to production?
6 comments

VW actively detected if the car was on a treadmill testing setup and electronically changed the engine parameters to make it run cleaner.

The water heater was designed with the test in mind but functions the same regardless.

VW was clearly cheating whereas the water heater is taking advantage of known deficiencies in the test (is that cheating?).

I'm not sure that's an accurate description of VW's cheat.

VW knew the parameters of the test (stationary car, no steering input, prescribed throttle inputs, etc). And they configured the car to pass the test.

Water-heater engineer knew the parameters of the test (number and location of probes). And he configured the water-heater to pass the test.

Same-same. In both cases, an engineering team willfully committed fraud to improve their sales figures.

The only differences are a bit pedantic. VW's "configuration" was more elaborate. But, both groups gamed the test.

You might misunderstand what VW did.

Apparently it is common for cars to have a "test mode" bit, because they run on a dyno (only one set of wheels spin), and the car may disable certain traction control systems, etc.

VW changed the way other systems (the engine itself) function in this mode. So even if everything is exactly the same on the road (e.g. 30mph in a straight line for hours), the car will perform differently and have different emissions. You don't drive around on a dyno.

Optimizing for the test may go up to the line, but VW crossed it.

The differences are not pedantic at all. The differences are extremely significant.

The VW engineers designed the car to do different things while being tested vs used by consumers. While being tested, the emissions system was on. While being used by consumers, the emissions system was entirely disabled.

The water heater engineers designed the water heater to do the same thing while being tested vs used by consumers.

If the water heater engineers designed the water heater to never turn on and keep the water at room temperature while being tested, ("works good boss, water heater efficiency is infinity percent") then yes, it would be reasonable to claim they did essentially the same thing.

> If the water heater engineers designed the water heater to never turn on and keep the water at room temperature while being tested

I think the difference between what they did and this is only a difference in degree and not kind though, right?

It is a difference in kind: VW dynamically detected and adapted behavior to the test. It would never operate in that way under normal conditions. The water heater example was completely static: it always behaved the same way, under test or not.
People obviously disagree since I was downvoted but I don't really see the black-and-white difference between the two.

In both cases there was an intentional design change to deliberately mislead a measurement, and the final product does not match the intended thing. Both adversarially directly cause the consumer's purchase to not be the intended item; no one gets a car that has the emissions measured and no one gets a heater that has the efficiency measured.

One added difference is with the water heater: the intent may be deceptive, but they provided what was asked for and what they claimed. It is almost malicious compliance. Dickish gaming and probably worth suing over, but not sure if it is criminal fraud per se.

It is like the joke about Soviet factory metrics for some item like nails. They set the quota on numbers and got useless ones better described as needles. They wised up a bit and set the quota by weight next month and got one useless massive nail instead. It isn't even too far from the truth given actual management involved things like a train line circularly shipping coal between depots instead of retrieving from source or distributing to end users to boost their "metric tons transported kilometers" metric.

VW meanwhile had a covert illegal configuration while claiming mileage and lack of urea maintenance. The claims are outright false - that it provides all three benefits instead of two of three.

That kind of mentality was the joke in many Eastern European movies in the 80s. Pretty much the standard comedy in Polish movies from that time.
It's not the same at all, and I don't think the differences are pedantic.

The water heater runs the same regardless of whether it's being tested or is running normally in someone's home. Really, this test "cheat" exposed that the test itself was not measuring what it thought it was measuring; and hell, a manufacturer could accidentally cheat with their design.

The car would run differently if it detected it was in a test situation. In the real world, with a real driver, it would run in a way that would give different test results (if you were in a position to run the test while it's being driven).

What was more heinous about the VW case IIRC is that the cleanliness gains the cars made on the treadmill setup were enough to push them into compliance with regulations, which the engines normally didn't meet.

"Cheating" a better efficiency rating is significantly less reprehensible than cheating a legal obligation which was legislated for public health and environmental reasons.

But the efficiency ratings also exist for public health and environmental reasons. Why does it matter to you that one is a mandate? They're both intentionally deceiving regulators and consumers (vs accidentally acing the test without cheating). Both are fraud.
I do think that manipulating a purely instructive measure is less extreme than manipulating a compliance test; consumers can seek alternate tests and reviews, but the state emissions test has special status even if a dozen other tests give a different result. That said, I believe Energy Star ratings affect tax rebates and electric bills, and they're required to be printed on products - so that's not really an arbitrary test.

There are other differences here too, I think. The water heater trick is passive manipulation that stays in place at all times, which limits how far from "real" performance it can get. And per the story, it seems more like "teaching to the test" than "cheating". That is, Volkswagen consciously moved away from the mandate outside of testing. The water heater was (potentially) as energy-efficient as they could design, with the test score manipulated on top of that.

None of that makes it harmless - if "as good as you can make" doesn't hit standards without manipulating them, that's still a problem. But I do find it less galling than "intentionally worsens emissions outside the test bench".

The flip side of the water heater test is, you could game the test the other way too. Making your water heater look worse than it is. Would you do that? No.

The difference between the water heater and VW is the water heater manufacturer is providing a representative sample. And VW was not. It'd also be dubious to say that the water heater company is acting in bad faith. Where VW's bad faith rose to the level of criminal. On the other hand Volvo appears to be acting in strictly good faith.

Bad faith for a crash test would be crafting a silver plate model for testing. Reminds me that's what my uncle said the power supply manufacturer he worked for did.

The difference is that the water heater test itself was flawed in that it depended on arbitrary design decisions that have nothing to do with efficiency. Two completely innocent manufacturers could build water heaters with the exact same real-world efficiency, but score fairly differently on the efficiency ratings just due to how they're designed.
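To make that concrete, here is a toy sketch, not a model of the real efficiency test. It assumes the tank is a stack of equal-volume layers and that the test only reads temperature at a few fixed probe positions. The probe positions (`PROBE_LAYERS`) and all temperatures are made-up numbers for illustration.

```python
# Hypothetical model: the tank is a list of equal-volume layers, bottom to top.
# PROBE_LAYERS and every temperature below are invented for illustration.

PROBE_LAYERS = [1, 3, 5]  # indices of the layers where the test places probes


def true_mean_temp(layers):
    """Mean temperature over every layer (proportional to stored heat)."""
    return sum(layers) / len(layers)


def probe_score(layers, probes=PROBE_LAYERS):
    """Mean temperature seen only at the probe locations."""
    return sum(layers[i] for i in probes) / len(probes)


# Two designs holding the same total heat, distributed differently.
design_a = [40, 50, 40, 50, 40, 50]  # warm layers happen to sit at the probes
design_b = [50, 40, 50, 40, 50, 40]  # warm layers happen to sit between them

for name, layers in (("A", design_a), ("B", design_b)):
    print(name, true_mean_temp(layers), probe_score(layers))
```

Both designs store the same heat, yet the probe-only score differs, and neither design does anything different when it is being tested.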

While I agree that this particular water heater manufacturer was doing something shady in order to get the best score, at least they weren't selling a product that did something differently while under test conditions vs. in real-world usage. They merely realized that the test itself had wide error bars, and designed their heater to "err" in the positive side of those.

VW, in contrast, sold a product that lied to the testers about its emissions in order to pass certifications, while in real-world driving would behave in a way that would not pass muster.

And to me I think that's the key: VW's cars intentionally behaved differently depending on if they were being tested or if they were being driven in normal real-world usage. This water heater behaved the same regardless of whether it was being tested or was heating water in someone's home.

In a way I think of this in academic terms. The water heater manufacturer studied the SAT to learn what kind of questions were going to be asked. VW stole the answer key to the test and memorized it.

One is polluting at the tailpipe, the other is only polluting at a power plant.
Many important differences.

1. Actively detecting the test and behaving differently. It's like stealing a test vs teaching to the test.

2. Lower stakes. Health issues are much more serious than inefficiency.

3. It affects the buyer. It's more acceptable for the buyer to be cheated than everyone around them.

4. People could have created these layers by accident. Favouring those who got lucky is unfair.

Honestly I think basically all my gadgets exaggerate how energy efficient they are, by tuning parameters for tests that don't correspond to the real world. My dishwasher has an energy efficient mode, the manual literally says it's just for compliance and recommends other modes. It's just a fact of life.

This is omnipresent even where regulators aren't involved: every graphics card benchmark out there is 'manipulated' relative to real world performance. At this point it's so universal that I don't think anyone is even fighting it - as long as everyone games benchmarks roughly the same amount, the relative scores stay usable.

Your point about fairness and passive design is the one that makes me view these cases differently also. In the anecdote, the product being tested was the same one being sold, and there's no sign the heater was worsened to improve test performance. The designers just picked the best-scoring option among some reasonable configurations. (Frankly, once they noticed that issue, what were they supposed to do? Pick the worst-scoring, or pick the spec out of a hat?)

In the VW story, the test-bench vehicle was fundamentally different from the market vehicle, and the road version was designed to behave worse on the metrics to get other gains. I happen to know someone who bought a diesel Jetta specifically because it was more eco-friendly than other options, and I think he'd draw a clear line between tuning for test metrics and VW consciously lying to their buyers.

It's interesting you mention graphics cards, because that very behavior has led the gaming community to favor benchmarks derived from min/max/avg FPS in a handful of current-gen games over so-called synthetic benchmarks. It only took a handful of instances of companies baking in "benchmark" modes that get triggered when certain benchmarks are detected for people to start discounting those benchmarks in favor of more organic measurements.
> Honestly I think basically all my gadgets exaggerate how energy efficient they are, by tuning parameters for tests that don't correspond to the real world.

Having measured all my gadgets with a Kill a Watt meter, that's not my experience. It seems that many gadget-makers realize that people don't really care about power draw, so they just slap the maximum draw onto the specs.

The big difference with VW is that they put in a mechanism that detects the test and completely changed the behavior of the car for the test.
So they _intentionally_ set themselves up to lie to regulatory agencies and consumers about real world efficiency. That honestly sounds basically the same to me. In both cases the tests are poor approximations, and in both cases someone could accidentally optimize the test, and in both cases someone did it intentionally to deceive people.
Imagine the water heater company didn't know they were gaming the test. They first design a water heater and it gets a B- on the test. Being overachievers, they work hard and submit a second design that gets an A+. They might not realize that both heaters are basically the same with the only difference being some heating element spacing that works better for the test. Both times they submitted a legit design that was the same they'd provide to consumers. Sure, we know the engineers knew what was happening, but we can see how one might innocently arrive in the same scenario. I think it's safe to say the test is flawed.

The VW test is not like that. There's no way to innocently arrive in the scenario they did. They did not game a bad test, they literally lied to the test administrators. The car ran in "clean mode" only if it was in the test environment. If the car ran like it did on the road, they'd have failed (which is how they were caught, with a mobile testing setup).

One of the points in the article is that regulating for safety based on known testing conditions is going to result in over-fitting for the test. The water heater company is guilty of intentionally over-fitting. VW just straight up lied. I don't think those 2 actions are equal, VW is worse, but I agree that both are dishonest to a degree.

> Imagine the water heater company didn't know they were gaming the test.

We don't have to. In this case we have clear admission of intent. The intent to deceive is what makes it fraud and not just being wrong.

My point isn't that what the water heater designer did was perfectly ethical, just that it's clearly distinct from what VW did.
It's distinct in technical detail, but not in the broken ethics rule against intent to deceive.
There's no intent to deceive necessary in the water heater example. The water heater company could have sent it in with a note to the regulator saying "we moved the second element up, because we believe it will perform better on your test" and the regulator would likely just accept it instead of redesigning the test.

Also, for the water heating one, there's a plausible reason for the regulator to care about the discrete measurements rather than the total amount of thermal energy in the water. Hot water at the top of the tank is more valuable, because it's used first and less likely to be wasted, so you could weight it more heavily in a test. There's no parallel for the VW test cheating. No indication that's what happened here, of course.
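A rough sketch of what scoring the top of the tank more heavily could look like, purely as an assumption about how such a test might be scored. The probe readings and the weights are hypothetical.

```python
def weighted_score(probe_temps, weights):
    """Weighted mean of probe temperatures; index 0 is the bottom probe."""
    if len(probe_temps) != len(weights):
        raise ValueError("need exactly one weight per probe")
    total = sum(t * w for t, w in zip(probe_temps, weights))
    return total / sum(weights)


# Hypothetical readings from bottom, middle, and top probes.
probes = [35.0, 45.0, 55.0]

equal_weights = [1, 1, 1]  # every probe counts the same
top_heavy = [1, 2, 3]      # assumed weighting that favors the top of the tank

print(weighted_score(probes, equal_weights))
print(weighted_score(probes, top_heavy))
```

With a top-heavy weighting, a design that keeps its hottest water high in the tank scores better, which would be a deliberate property of the test and not a loophole.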

There is no deception and no ethical issue. The nature of the test is known.

What placement of components would be ethical? Should engineers be required to be separated from the test parameters by a Chinese wall? Do they need to build the system for the worst result? Some middle ground? If the engineers are unethical, where is the line?

The obvious answer to optimizations like this is for the testing body to tweak the test procedure based on what manufacturers do over time. That provides an incentive to be more conservative or accurate.

The key legal difference is that VW literally behaved differently if a test was running. If they had simply designed a system that tested better than it actually performed (by optimizing the factors tested for), they would not have gotten in trouble.

If the water heater manufacturer had special heating elements that only ran during the test, it would be equivalent.
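The distinction can be sketched in a few lines. This is an illustration only, not VW's actual software. The sensor fields, the detection rule, and the mode names are all hypothetical, loosely based on the test conditions mentioned in this thread (dyno, no steering input).

```python
from dataclasses import dataclass


@dataclass
class Conditions:
    """Hypothetical sensor snapshot; field names are illustrative only."""
    steering_angle_deg: float
    one_axle_spinning: bool  # True on a dyno, where only one set of wheels turns


def looks_like_test(c: Conditions) -> bool:
    """Assumed detection rule: dyno-like wheel speeds and no steering input."""
    return c.one_axle_spinning and abs(c.steering_angle_deg) < 1.0


def static_design(c: Conditions) -> str:
    """Tuned with the test in mind, but behaves the same everywhere."""
    return "controls_on"


def detecting_design(c: Conditions) -> str:
    """Changes behavior depending on whether a test seems to be running."""
    return "controls_on" if looks_like_test(c) else "controls_off"


dyno = Conditions(steering_angle_deg=0.0, one_axle_spinning=True)
road = Conditions(steering_angle_deg=5.0, one_axle_spinning=False)

for design in (static_design, detecting_design):
    print(design.__name__, design(dyno), design(road))
```

The static design gives the tester the same product the customer gets; the detecting one gives the tester a product nobody ever drives.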

> in both cases someone could accidentally optimize the test

I think this is what I disagree with.

The water heater story is about a viable-for-market design which also optimized for the test. The equivalent for a car emissions test might be optimizing the transmission to reduce emissions at the specific speeds which will be tested. Those speeds could be sweet spots of the engine curve by accident, or they could be planned that way. I don't think that's necessarily right, but it's within the bounds of "natural" design for the product.

Instead of doing that, VW submitted something for testing which was fundamentally different from what went to market. Rather than being misleading, the test results were fundamentally irrelevant. Creating two completely different modes of behavior isn't something you could do by chance, and it means there's no real limit on how badly they could cheat.

There is a difference between memorizing enough information to ace a test and sneaking in notes that aren't allowed to ace a test. And people would also say it is wrong to steal a copy of a test and then memorize the answers to it to ace a test. But what if a professor uses the same test every year (maybe changing a few numbers but in a way that only impacts the calculations, not the way to solve it) and people study just the information needed to answer the test. Is that cheating?
If you cheat in most such tests it just means you miss out on actually learning what you were supposed to. If it wasn't your intention to learn anyway I guess that's fine.

Rarely the purpose of tests is to assure the public of your fitness (e.g. a driving test) and cheating those might be a problem, but if you cheat my CS 101 course and then struggle because you needed remedial classes but the cheated test means you don't get them that's your problem.

Another aspect is the incentives. Most discussion here is about the cheating itself, and not the reasons for it. I may not learn much from just writing about a degree I don't really have on my resume, or roles I never worked at, and experience I don't have. But I can get paid a lot more by doing so.
There are a few other issues with cheating, such as devaluing a degree for all others who didn't cheat to earn it.
Morally, they both seem to fall in the same category. Legally, it might be a complicated question.
That sounds exactly like the third sentence of the article.

> Sun managed to increase its score on 179.art (a sub-benchmark of specfp) by 12x with a compiler tweak that essentially re-wrote the benchmark kernel.

Yes, but you're talking about what VW did vs what Sun did, while the person you're replying to is talking about what VW did vs what a company that makes water heaters did.

I agree that what Sun did is very similar to what VW did, with the exception that VW's increased emissions (statistically speaking) killed people, and what Sun did likely had no health impact on anybody except a few hurt paychecks.

Sort of, except that emissions testing is a regulatory requirement.
VW's vehicles did not meet the standards required when used every day. They only met the standard during testing.

The water heaters don't sound like they'd fail any given test.

With the following question, I'm not absolving VW from criticism. With that in mind:

Why are we not holding those doing the measurement accountable as well?

If you produce a test that can be gamed and your job is to test things to meet consumer expectations, you've failed at your job.

After all is said and done, what is a better outcome: a) VW is punished for gaming the test b) the test is significantly harder to game

With (a), we have only one less manufacturer gaming the tests, VW. With (b) we have tests that none of the manufacturers can game any longer or at least will take time to game. The testers should be expected to always be two steps ahead.

This is not unlike whitehat/blackhat security engineering. We should pay bug bounties to teams that successfully exploit the tests and we should be actively running red team drills.

https://en.wikipedia.org/wiki/Red_team

What VW did is similar to Uber's Greyball system, in that they give a different experience to the regulator rather than giving everyone an experience that is tuned to what a regulator might hope to see.