by junkcollector 3061 days ago
I would argue that air gaps are critical in ICS networks and PLCs because it is an extreme safety risk to push a potentially bad update from offsite without clearing the operator. The major issue will be pushing back against cost pressure from management.

This is already turning into a major problem with IoT devices from supposedly reputable companies, e.g. home thermostats that let pipes burst because of a bad or incorrectly pushed update. Now consider if that bad update could instead kill someone.

2 comments

The operator should never be in harm's way, period. If they are working on equipment it should be physically locked out: say, a valve would have a 6" diameter pin through the mechanism and a padlock on it so the pin can't be removed except by the operator, and electrical equipment is powered off with the breaker locked in the off position. If they put themselves in harm's way they will eventually be fired for unsafe work.

Bad updates are bad updates. They are incompetence on the part of the programmer. But the worst they should be able to cause is a minor equipment malfunction; otherwise the physical system has been poorly designed.

I think you misunderstand. When maintenance is being done on the machine itself, yes, you are going to do a lock-out tag-out routine. On the flip side, if you remotely push the update and it contains, for instance, a register read error on the rate controller that shifts the bits by 1, the operator's first notice is going to be when he loads it up and the ensuing chatter violently throws the work piece.
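To make the bit-shift case concrete, here is a minimal sketch of what a one-bit misalignment does to a setpoint. The 16-bit register width, the `read_rate_register` helper, and the setpoint value are all assumptions for illustration, not details from any real controller.

```python
def read_rate_register(raw: int, shift: int = 0) -> int:
    """Decode a hypothetical 16-bit rate register.

    `shift` models a read that is misaligned to the left by that many bits.
    """
    return (raw << shift) & 0xFFFF


setpoint = 1200  # hypothetical rate setpoint, units are arbitrary

correct = read_rate_register(setpoint)
shifted_left = read_rate_register(setpoint, shift=1)  # misread one bit to the left
shifted_right = setpoint >> 1                         # misread one bit to the right

print(correct, shifted_left, shifted_right)
```

A one-bit error either doubles or halves the value the controller acts on, and nothing about the result looks invalid to the rest of the program.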

Or something dumber: maybe they pushed a temp variable that updates every cycle of the control loop to the wrong type of memory (say flash EEPROM), and then did a push to an entire line of manufacturing equipment. Congratulations, all of those are going to brick in an hour or so when the memory hits its maximum write cycle lifetime, and you'll have to get new boards shipped in while your entire line is down.
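The arithmetic behind that failure is quick to check. This is a rough sketch under assumed numbers: an endurance of 100,000 write cycles, a 10 ms control loop, one write per loop, and no wear leveling. Real parts and loop times vary, so treat the figures as placeholders.

```python
def seconds_until_wearout(endurance_cycles: int, loop_period_ms: int,
                          writes_per_loop: int = 1) -> float:
    """Time until one memory cell exhausts its rated write cycles.

    Assumes every loop iteration rewrites the same cell (no wear leveling).
    """
    return endurance_cycles * loop_period_ms / (1000 * writes_per_loop)


ENDURANCE_CYCLES = 100_000  # assumed rating, check the part's datasheet
LOOP_PERIOD_MS = 10         # assumed control loop period

seconds = seconds_until_wearout(ENDURANCE_CYCLES, LOOP_PERIOD_MS)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
```

With these assumptions the cell is worn out in well under an hour, and every unit that received the same push fails at about the same time.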

Remote pushing needs to be handled very carefully when controlling real world equipment.

My experience is limited to controlling turbines, generators, pumps, valves, hydraulics, and dams using Schneider, Allen Bradley, and Unitronics PLCs. No motion controllers, no conveyor belts, no factories, no robots, no VFDs.

I have made a total of at least 7,500 remote updates over maybe 25 different control systems. In fact we basically get it working well enough that the operators can handle day-to-day stuff and then go remote after that, because who wants to stay away from home at some dirty industrial site eating crappy food? The exception to that is backup power for hospitals: no remote access there, and it is a simple enough system that we can test all the different scenarios and then walk away.

Generally the last thing I put before an output in a PLC is some rate limiting. If it is a discrete output it won't be allowed to operate more than once every 5s, for example. If it is an analog output it is limited to achieve the maximum desired actuator velocity or acceleration. This is a good catch-all to avoid damaging equipment. I watched somebody learn this the hard way as a DC motor starter exploded when it was told to start and stop the motor 10 times a second.
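A rough sketch of that kind of output limiting, written in Python rather than PLC logic. The class names and parameters are hypothetical, and "operate" is assumed to mean energizing the output, so turning it off is always allowed. Only a velocity (slew) limit is shown for the analog case, not an acceleration limit.

```python
class DiscreteRateLimiter:
    """Allow a discrete output to turn on at most once per min_interval_s."""

    def __init__(self, min_interval_s: float):
        self.min_interval_s = min_interval_s
        self.state = False
        self.last_on_s = None  # time of the last accepted turn-on

    def update(self, requested: bool, now_s: float) -> bool:
        if not requested:
            self.state = False  # turning off is never delayed
        elif not self.state:
            elapsed_ok = (self.last_on_s is None
                          or now_s - self.last_on_s >= self.min_interval_s)
            if elapsed_ok:
                self.state = True
                self.last_on_s = now_s
        return self.state


class SlewRateLimiter:
    """Limit how fast an analog output may move toward its target."""

    def __init__(self, max_rate_per_s: float, initial: float = 0.0):
        self.max_rate_per_s = max_rate_per_s
        self.output = initial

    def update(self, target: float, dt_s: float) -> float:
        max_step = self.max_rate_per_s * dt_s
        delta = target - self.output
        self.output += max(-max_step, min(max_step, delta))
        return self.output


starter = DiscreteRateLimiter(min_interval_s=5.0)
valve = SlewRateLimiter(max_rate_per_s=10.0)
```

Calling `starter.update(True, now_s)` ten times a second would energize the output once and then ignore further start requests until 5 s have passed since the last accepted start, whatever the upstream logic asks for.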

Certainly I have made errors. A bad one was when I forgot to limit a position so that it could not be less than 0. The position was subtracted from some other number. Subtracting a negative number is adding! That was a nasty positive feedback loop that resulted in fast oscillations in the position of a 1m diameter pressure reducing valve. However, that was during on-site commissioning, not remote.
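The sign problem is easy to reproduce. In this sketch the "other number" is assumed to be a travel limit, and `remaining_travel` is a hypothetical name. The original calculation is not described in enough detail to reconstruct it.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the range [low, high]."""
    return max(low, min(high, value))


def remaining_travel(limit: float, position: float) -> float:
    """Distance left to the limit, with position clamped to [0, limit] first."""
    return limit - clamp(position, 0.0, limit)


LIMIT = 100.0       # assumed full travel, arbitrary units
bad_reading = -20.0  # position that should never be negative

unclamped = LIMIT - bad_reading                 # subtraction turns into addition
clamped = remaining_travel(LIMIT, bad_reading)  # negative reading treated as 0

print(unclamped, clamped)
```

Without the clamp the result exceeds the physical travel, which is the kind of value that can push a control loop the wrong way.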

Certainly for major changes I will require a shutdown and coordinate with the operators, but it is a judgement call on my part as to what I can program and test at the office and unleash on remote equipment vs. what needs to be tested on the actual equipment. Most testing on the equipment is required to determine the equipment's characteristics.

A large amount of this sounds like just awareness, authorization, and authentication issues with updates. The plant should be aware of what changes are being made, and should be able to roll them back.