by barbegal 2846 days ago

I think it should be allowed, but the tests it must pass should be far more stringent than those for traditional software. I'm happy with a failure rate of around one catastrophic failure per million hours. Even humans sometimes fail catastrophically and black out at the wheel for no detectable reason.

That level of testing is well beyond what today's software and hardware are capable of. Waymo has to override its cars (a disengagement) approximately every 6,000 miles [1], which equates to about 200 hours of driving. To reach a confidence level of 1 in a million hours, you would need a test fleet of a thousand vehicles operating for a whole year without any incident that requires human intervention. The cost of such testing would run into the hundreds of millions of dollars, which makes me feel that only the largest corporations in the world could develop this technology.

[1] https://www.dmv.ca.gov/portal/dmv/detail/vr/autonomous/disen...
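
The figures above can be sanity-checked with a rough sketch. It assumes the fleet runs around the clock and that catastrophic failures arrive as a Poisson process; neither assumption is stated in the comment, and the function name is made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: vehicles operate 24/7

# Figures quoted in the comment.
miles_per_disengagement = 6000
hours_per_disengagement = 200
implied_speed_mph = miles_per_disengagement / hours_per_disengagement

fleet_size = 1000
fleet_hours = fleet_size * HOURS_PER_YEAR  # hours logged by the fleet in one year

target_rate = 1 / 1_000_000  # catastrophic failures per hour


def failure_free_hours_needed(rate: float, confidence: float) -> float:
    """Failure-free hours needed to claim the true rate is below `rate`.

    Assumes a Poisson process, so P(zero failures in T hours) = exp(-rate * T).
    Solving exp(-rate * T) = 1 - confidence for T gives the bound.
    """
    return -math.log(1 - confidence) / rate


hours_95 = failure_free_hours_needed(target_rate, 0.95)

print(f"implied average speed: {implied_speed_mph:.0f} mph")
print(f"fleet hours in a year: {fleet_hours:,}")
print(f"failure-free hours for 95% confidence: {hours_95:,.0f}")
```

Under these assumptions, a thousand vehicles logging a failure-free year would clear the 95% bound of roughly three million hours. A fleet that drives only part of each day would need proportionally longer.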

amelius 2846 days ago

And after certification, what if the company wants to push a quick update to all of its cars every now and then, through a remote update? Would that be allowed? How would we even know that it happens?

eitland 2845 days ago

> And after certification, what if the company wants to push a quick update to all of its cars every now and then, through a remote update?

Late and a bit rough but here is my idea:

Install redundant self-driving units in at least a good number of the first few thousand cars in each generation.

When planning a release, push to the redundant unit in the cars already running in "production".

Use only the primary unit as input to the car, as usual, but log the diffs between the new version and the old version in the same way they now log driver interventions.

I think there are a number of issues this won't catch. Off the top of my head: what if the new self-driving unit attempts to turn slightly faster on a slippery road?

But it should be able to collect realistic feedback really fast, i.e. in a few months. That is crazy slow for modern application developers like me, but more than fast enough for anything that should be allowed to drive unsupervised, I guess :-)
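
A minimal sketch of the idea above, often called shadow mode: the candidate version sees the same input as the primary, only the primary's output reaches the car, and disagreements are logged. The `Command` fields, tolerances, and toy planners are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Command:
    steering_deg: float  # hypothetical units
    speed_mps: float


Planner = Callable[[dict], Command]


class ShadowRunner:
    """Runs a candidate planner next to the primary and logs disagreements."""

    def __init__(self, primary: Planner, candidate: Planner,
                 steering_tol: float = 1.0, speed_tol: float = 0.5):
        self.primary = primary
        self.candidate = candidate
        self.steering_tol = steering_tol
        self.speed_tol = speed_tol
        self.diff_log: list = []

    def step(self, frame: dict) -> Command:
        active = self.primary(frame)  # the only output that drives the car
        try:
            shadow = self.candidate(frame)
        except Exception as exc:  # a crash in the candidate must not affect driving
            self.diff_log.append({"frame": frame, "error": repr(exc)})
            return active
        if (abs(active.steering_deg - shadow.steering_deg) > self.steering_tol
                or abs(active.speed_mps - shadow.speed_mps) > self.speed_tol):
            self.diff_log.append({"frame": frame, "primary": active, "shadow": shadow})
        return active


# Toy planners: the new version takes slippery roads faster than the old one.
def old_planner(frame: dict) -> Command:
    return Command(0.0, 10.0 if frame["slippery"] else 20.0)


def new_planner(frame: dict) -> Command:
    return Command(0.0, 14.0 if frame["slippery"] else 20.0)


runner = ShadowRunner(old_planner, new_planner)
outputs = [runner.step(f) for f in ({"slippery": False}, {"slippery": True})]
print(runner.diff_log)
```

The log records that the two versions disagreed on the slippery frame, but not what would have happened had the new version been in control, which is the limitation the comment points out.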

stallmanite 2845 days ago

I am fascinated by the testing methodology you described (current version and proposed upgrade run in parallel in production). I hope this isn't a dumb question, but is there a name for that style of testing, and do there exist categories of systems for which it is commonly used?

eitland 2845 days ago

> I hope this isn't a dumb question

Definitely not : )

> but is there a name for that style of testing and do there exist categories of systems for which it is commonly used

I don't know, but I guess parallel deployment, production testing (hehe) or something.

I guess I learned about it here on HN, but I've heard of similar approaches elsewhere.

Three concrete systems I can think of that have been tested in similar but not identical ways:

- trajectory calculation systems for some space probe (I forgot the details), where two separate vendors were tasked with writing software and their software was then run in parallel in a simulation of possible trajectories to root out any bugs. Mentioned as an example to point out an extreme variant of testing, probably by someone here on HN or in something linked to from HN.

- a vendor testing bag sorting equipment at an airport. Probably told to me by a colleague of mine at the time who knew them. AFAIK they'd rerun batches and verify the outputs, making sure the new system produced similar or better results.

- an API testing service, probably mentioned here on HN as well, that worked by firing three similar requests for each resource and method: two to a deployment of the old system and one to the new. The two that were fired against the old system would be used to find the parts of the response that changed from request to request (timestamps etc.), and the rest of the response would be used as a template to verify the response from the new system.

More generally: capturing and reusing production input, or running two sets of services in parallel in a data center or across data centers, with the existing system running as usual and the system under test receiving only the input data.
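
The three-request approach from the last bullet can be sketched roughly as follows. This is not the service's actual implementation; it assumes flat JSON-like dictionaries, and every name and sample value here is hypothetical.

```python
_MISSING = object()  # marks a key that is absent from a response


def volatile_keys(old_1: dict, old_2: dict) -> set:
    """Keys whose values differ between two responses from the old system."""
    return {
        key
        for key in old_1.keys() | old_2.keys()
        if old_1.get(key, _MISSING) != old_2.get(key, _MISSING)
    }


def build_template(old_1: dict, old_2: dict) -> tuple:
    """Keep only the stable fields; return them along with the volatile keys."""
    noisy = volatile_keys(old_1, old_2)
    stable = {key: value for key, value in old_1.items() if key not in noisy}
    return stable, noisy


def mismatches(template: dict, noisy: set, new_response: dict) -> dict:
    """Map each offending key to an (expected, actual) pair."""
    problems = {}
    for key, expected in template.items():
        actual = new_response.get(key, _MISSING)
        if actual != expected:
            problems[key] = (expected, actual)
    # Fields the old system never returned are also suspicious.
    for key in new_response.keys() - template.keys() - noisy:
        problems[key] = (_MISSING, new_response[key])
    return problems


# Two calls to the old deployment: only the timestamp changes between them.
old_1 = {"id": 7, "name": "bag-42", "timestamp": "12:00:01"}
old_2 = {"id": 7, "name": "bag-42", "timestamp": "12:00:02"}
template, noisy = build_template(old_1, old_2)

new_ok = {"id": 7, "name": "bag-42", "timestamp": "12:00:03"}
new_bad = {"id": 7, "name": "BAG-42", "timestamp": "12:00:03"}

print(mismatches(template, noisy, new_ok))
print(mismatches(template, noisy, new_bad))
```

Fields that vary between the two old responses are ignored entirely, so a regression confined to a volatile field would slip through. Nested responses would need a recursive comparison.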

maxerickson 2846 days ago

Just make update transparency part of the initial certification.

Of course they might cheat, but who really knows if their car has an airbag where it is supposed to?

amelius 2846 days ago

> Just make update transparency part of the initial certification.

Somehow I doubt the authorities have the foresight.

toomuchtodo 2846 days ago

Even when they do, corporations will attempt to subvert process and policy "interlocks". See: the industry-wide emissions cheating scandal.

TheSpiceIsLife 2846 days ago

They do that sort of thing occasionally, but on the whole, over time, cars have improved with regard to emissions and safety, mostly due to regulation.

nradov 2846 days ago

Perhaps, but it's going to be difficult to justify that logic to a jury in a wrongful death civil lawsuit.