The system crashed while my coworker was running a code (aka doing CPR) in the ER last night. Healthcare IT is so bad at baseline that we are somewhat prepared for an outage while resuscitating a critical patient.
The second largest hospital group in Nashville experienced a ransomware attack about two months ago. Nurses told me they were using manual processes for three weeks.
It takes a certain type of criminal a55hole to attack hospitals and blackmail them. I would easily support a life sentence or the death penalty for anyone attempting this cr@p.
Yes. And I was told by multiple nurses at St. Thomas Midtown that the hospital did not have manual procedures already in place. In their press release they refer to their hospitals as "ministries" [0], so apparently they practice faith-based cyber security (as in "we believe that we don't need backups") since it took over 3 weeks to recover.
Speaking as a paramedic: there is very little about running a code that requires IT. You have the crash cart, so you're not even stuck trying to get meds out of the Pyxis. The biggest challenge is charting / scribing the encounter.
I used to work in healthcare IT. Running a code is not always only CPR.
Different medications may be pushed (injected into the patient) to help stabilize them. These medications are recorded via a bar code and added to the patient's chart in Epic. Epic is the source of truth for the current state of the patient. So if that is suddenly unavailable, that is a big problem.
Okay, not having historical data available to decide what to put into a patient is an understandable problem (but maybe also print the critical stuff per patient once a day?), but not being able to log an action in real time should not be a critical problem.
It is a critical problem if your entire record of life-saving drugs you've given them in the past 24 hours suddenly goes down. You have to start relying on people's memories, and it's made worse by shift turn-overs so the relevant information may not even be reachable once the previous shift has gone home.
There are plenty of drugs that can only be given in certain quantities over a certain period of time, and if you go beyond that, it makes the patient worse not better. Similarly there are plenty of bad drug interactions where whether you take a given course of action now is directly dependent on which drugs that patient has already been given. And of course you need to monitor the patient's progress over time to know if the treatments have been working and how to adjust them, so if you suddenly lose the record of all dosages given and all records of their vital signs, you've lost all the information you need to treat them well. Imagine being dropped off in the middle of nowhere, randomly, without a GPS.
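To make the "certain quantities over a certain period of time" point concrete, here is a minimal sketch of the check that stops being possible once the administration record is gone. Everything in it is hypothetical: the drug names, the amounts, the 24-hour window and the limit are placeholders for illustration, not clinical values.

```python
from datetime import datetime, timedelta

def dose_in_window(administrations, drug, now, window):
    """Sum the doses of `drug` given in the trailing `window` ending at `now`."""
    start = now - window
    return sum(
        amount
        for name, given_at, amount in administrations
        if name == drug and start < given_at <= now
    )

def can_give(administrations, drug, amount, now, window, max_total):
    """True if giving `amount` now keeps the windowed total within `max_total`."""
    return dose_in_window(administrations, drug, now, window) + amount <= max_total

# Hypothetical administration log: (drug, time given, amount).
now = datetime(2024, 7, 19, 12, 0)
log = [
    ("drug_x", datetime(2024, 7, 18, 10, 0), 2.0),  # 26 hours ago, outside the window
    ("drug_x", datetime(2024, 7, 19, 2, 0), 3.0),
    ("drug_y", datetime(2024, 7, 19, 9, 0), 1.0),
    ("drug_x", datetime(2024, 7, 19, 9, 0), 1.0),
]

day = timedelta(hours=24)
total_x = dose_in_window(log, "drug_x", now, day)       # 3.0 + 1.0
ok_small = can_give(log, "drug_x", 1.0, now, day, 5.0)  # exactly reaches the limit
ok_large = can_give(log, "drug_x", 2.0, now, day, 5.0)  # would exceed the limit
```

Without `log`, neither question can be answered except from someone's memory, which is the whole problem.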
That's why there's a sharpie in the first aid kit. If you're out of stuff to write on you can just write on the patient.
More seriously, we need better purpose-built medical computing equipment that runs on its own OS and only has outbound network connectivity for updating other systems.
I also think of things like the old school "check list boards" that used to be literally built into the yoke of the airplane they were made for.
I’m afraid the profitability calculation shifted it in favor of off-the-shelf OS a long time ago. I agree with you, though, that a general purpose OS has way too much crap that isn’t needed in a situation like this.
> It is a critical problem if your entire record of life-saving drugs you've given them in the past 24 hours suddenly goes down.
Will outages like this motivate a backup paper process? The automated process should save enough information on paper so a switch over to paper process at any time is feasible. Similar to elections.
Maybe if all the profit seeking entities were removed from healthcare that money could instead go to the development of useful offline systems.
Maybe a handheld device for scanning in drugs or entering procedure information that stores the data locally, which can then be synced with a larger device with more storage somewhere that is also 100% local and immutable, which in turn can sync to online systems if that is needed.
Many places did revert to paper processes. But it’s a disaster model that has to be tested to make sure everyone can still function when your EMR goes down. Situations like this just reinforce that you can’t plan for if IT systems go down; you plan for when they go down.
I don't think it's that historical data is required to make a decision; it's that the action has to be stored for historical purposes in the future. This is ultimately to bill you, to track that a doctor isn't stealing medication or improperly treating the patient, and to track it for legal purposes.
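A rough sketch of that store-locally-then-sync idea, assuming an append-only in-memory journal and a caller-supplied `send` function. Both `LocalJournal` and `send` are made-up names for illustration; a real device would persist to local storage and talk to whatever the upstream system actually exposes.

```python
class LocalJournal:
    """Append-only local record that is pushed upstream when a link is available."""

    def __init__(self):
        self._entries = []  # entries are only ever appended, never edited
        self._synced = 0    # how many entries have already been sent upstream

    def record(self, entry):
        """Store a copy of the entry locally; works with no network at all."""
        self._entries.append(dict(entry))

    def pending(self):
        """Entries recorded locally but not yet accepted upstream."""
        return list(self._entries[self._synced:])

    def sync(self, send):
        """Push pending entries in order; stop at the first failure and keep the rest."""
        sent = 0
        for entry in self._entries[self._synced:]:
            try:
                send(entry)
            except ConnectionError:
                break
            self._synced += 1
            sent += 1
        return sent


# Hypothetical upstream that can be switched on and off.
upstream = []
link = {"up": False}

def send(entry):
    if not link["up"]:
        raise ConnectionError("upstream unreachable")
    upstream.append(entry)

journal = LocalJournal()
journal.record({"t": "12:01", "event": "scan", "item": "drug_x"})
journal.record({"t": "12:03", "event": "procedure", "item": "IO access"})

sent_offline = journal.sync(send)  # link is down: nothing leaves, nothing is lost
link["up"] = True
sent_online = journal.sync(send)   # link is back: the backlog goes out in order
```

The point of the design is that recording never depends on the network; only syncing does.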
Some hospitals require you to input this in order to even get physical access to the medications.
Although a crash cart would normally have common things necessary to save someone in an emergency, so I would think that if someone was truly dying they could get them what they needed. But of course there are going to be exceptions and a system being down will only make the process harder.
Of course the real backup plan should be designed based on the actual needs, perhaps the whole system needs an "offline mode" switch. I assume they already run things locally, in case the big cable seeker machine arrives in the neighborhood.
Most printers in these facilities run standalone on an embedded Linux variant. They can actually host whole folders of data for reproduction "offline". Actually, all scan/print/fax multifunction machines can generally do that these days. If the onsite IT is good, though, the USB ports and storage on the devices should be locked down.
Oh yes. This would be a contingency measure, just to keep the record in a human readable form while requiring little manual labor. Printed codes could be scanned later into Epic and, if you need to transfer the patient, tear the paper and send it with them.
It is not necessarily CrowdStrike's responsibility, but it should be someone's.
If I go to Home Depot to buy rope for belaying at my rock climbing center and someone falls, breaks the rope and dies, then I am on the hook for manslaughter.
Not the rope manufacturer, who clearly labeled the packaging with "do not use in situations where safety can be endangered". Not the retailer, who left it in the packaging with the warning, and made no claim that it was suitable for a climbing safety line. But me, who used a product in a situation where it was unsuitable.
If I instead go to Sterling Rope and the same thing happens, fault is much more complicated, but if someone there was sufficiently negligent they could be liable for manslaughter.
In practice, to convict of manslaughter, you would need to show an individual was negligent. However, our entire industry is bad at our job, so no individual involved failed to perform their duties to a "reasonable" standard.
Software engineering is going to follow the path that all other disciplines of meatspace engineering did. We are going to kill a lot of people; and every so often, enough people will die that we add some basic rules for safety critical software, until eventually, this type of failure occurring without gross negligence becomes nearly unthinkable.
It's on whoever runs the hospital's computer systems. Allowing a ring 0 kernel driver to update ad hoc from the internet is just sheer negligence.
Then again, the management that put this in are probably also the same idiots that insist on a 7 day lead time CAB process to fix a typo on a brochureware website "because risk".
This patient is dead. They would not have been if the computer system was up. It was down because of CrowdStrike. CrowdStrike had a duty of care to ensure they didn't fuck over their client's systems.
I'm not even beyond two degrees of separation here. I don't think a court'll have trouble navigating it.
I don't think you understand the scale of this problem. Computers were not up to print from. Our Epic cluster was down for placing and receiving orders. Our lab was down and unable to process bloodwork - should we bring out the mortar and pestle and start doing medicine the old fashioned way? Should we be charged with "criminal negligence" for not having a jar of leeches on hand for when all else fails?
I was advocating for a paper fallback. That means that WHILE the computers are running, you must create a paper record, e.g. “medication x administered at time y”, etc., hence the receipt printers, which are cheap and low-dependency.
The grandparent indicated that the problem was that when all the computers went down, they couldn’t look up what had already been done for the patient. I suggested a simple solution for that: receipt printers.
After the computers fail you tape the receipt to the wall and fall back to pen and paper until the computers come back up.
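For what it's worth, the software side of the receipt idea is tiny. Here is a sketch that turns one charted event into lines narrow enough for a receipt printer. The 32-character width, the `receipt_lines` name and the patient ID format are assumptions, and actually driving a printer is left out.

```python
import textwrap
from datetime import datetime

WIDTH = 32  # assumed characters per line on the receipt printer

def receipt_lines(patient_id, event, when, width=WIDTH):
    """Format one charted event as timestamp header, wrapped text, and a tear line."""
    header = f"{when:%Y-%m-%d %H:%M} {patient_id}"
    body = textwrap.wrap(event, width=width)
    return [header[:width]] + body + ["-" * width]

event = "medication x administered 1 mg IV"
lines = receipt_lines("PT-0001", event, datetime(2024, 7, 19, 12, 5))
for line in lines:
    print(line)
```

Each event gets printed the moment it is charted, so the strip of paper at the bedside is always current up to the last entry before the outage.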
I completely understand the scale of the outage today. I am saying that it was a stupid decision and possibly criminally negligent to make a life critical process dependent on the availability of a distributed IT application not specifically designed for life critical availability. I strongly stand by that POV.
For relying on Windows to run this kind of stuff and not doing any kind of staged rollout, but just blindly applying untested third-party kernel driver patches fleet-wide? Yeah, honestly. We had safer rollouts for cat videos than y'all seem to have for life critical systems. Maybe some criminal liability would make y'all care about reliability a bit more.
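By "staged rollout" I mean something like the sketch below: each host is hashed into a stable bucket from 0 to 99, an update goes out to a growing percentage of buckets, and hosts flagged as critical wait until the final stage. The function names, the salt, the stage percentages and the host names are all made up for illustration; this is not how any particular vendor does it.

```python
import hashlib

def rollout_bucket(host_id, salt="update-001"):
    """Map a host to a stable bucket in 0..99 for this particular update."""
    digest = hashlib.sha256(f"{salt}:{host_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100

def should_update(host_id, percent, critical=False, salt="update-001"):
    """Decide whether a host takes the update at the current rollout percentage."""
    if critical and percent < 100:
        return False  # life-critical hosts only update at the final stage
    return rollout_bucket(host_id, salt) < percent

hosts = [f"host-{i}" for i in range(200)]
stages = [1, 10, 50, 100]  # percent of the fleet per stage

# Each wave contains the previous one, because a host's bucket never changes.
waves = [{h for h in hosts if should_update(h, p)} for p in stages]
for percent, wave in zip(stages, waves):
    print(f"{percent:>3}% stage: {len(wave)} hosts")
```

You stop between stages and check that the machines that took the update are still alive before widening it.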
A QR code can store about 3 KB of data. Every patient has a small QR sticker printer on their bed. Whenever Epic updates, print a new small QR sticker. If the patient is being moved, tear off the sticker and stick it to their wrist tag.
That much of the patient's state will be carried on their wrist. Maybe for complex cases you need two stickers. You have to be judicious in encoding the data, maybe just the last 48 hours.
Handheld QR readers, offline, that read and display the QR data strings.
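A sketch of the "be judicious in encoding" part, assuming events are small JSON records with ISO timestamps. It keeps the last 48 hours, compresses, and drops the oldest events until the payload fits one code. The 2,953-byte figure is the byte-mode capacity of the largest QR code (version 40) at the lowest error-correction level; the record fields and function names are made up, and rendering the actual QR image is left to whatever library or printer firmware you have.

```python
import json
import zlib
from datetime import datetime, timedelta

QR_MAX_BYTES = 2953  # byte-mode capacity of a version 40 QR code at error correction level L

def encode_state(events, now, hours=48, budget=QR_MAX_BYTES):
    """Keep events from the last `hours`, dropping the oldest until the payload fits."""
    cutoff = now - timedelta(hours=hours)
    recent = [e for e in events if datetime.fromisoformat(e["t"]) >= cutoff]
    recent.sort(key=lambda e: e["t"])  # same-format ISO timestamps sort chronologically
    while True:
        raw = json.dumps(recent, separators=(",", ":")).encode("utf-8")
        payload = zlib.compress(raw, 9)
        if len(payload) <= budget or not recent:
            return payload, recent
        recent = recent[1:]  # drop the oldest event and try again

def decode_state(payload):
    """What the offline handheld reader would do after scanning the sticker."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))

now = datetime(2024, 7, 19, 12, 0)
events = [
    {"t": "2024-07-16T08:00", "e": "drug_x 2mg"},     # older than 48 h, dropped
    {"t": "2024-07-18T20:30", "e": "drug_x 3mg"},
    {"t": "2024-07-19T11:45", "e": "BP 118/76 HR 82"},
]
payload, kept = encode_state(events, now)
restored = decode_state(payload)
```

Compression means the reader has to decompress before displaying; if you want the sticker to be readable by any dumb scanner, skip the `zlib` step and accept fewer events per sticker.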
You need to document everything during a code arrest. All interventions, vitals and other pertinent information must be logged for various reasons. Paper and pen work but they are very difficult to audit and/or keep track of. Electronic reporting is the standard and deviating from the standard is generally a recipe for a myriad of problems.
We chart all codes on paper first and then transfer to computer when it's done. There's a nurse whose entire job is to stay in one place and document times while the rest of us work. You don't make the documenter do anything else because it's a lot of work.
And that's in the OR, where vitals are automatically captured. There just aren't enough computers to do real-time electronic documentation, and even if there were there wouldn't be enough space.
I chart codes on my EPCR, in the PT's house, almost every day with one hand. Not joking about the one hand either.
It's easier, faster, and more accurate than writing in my experience. We have a page solely dedicated to codes and the most common interventions. Got IO? I press a button and it's documented with a timestamp. Pushing EPI, button press with timestamp. Dropping an I-Gel or intubating, button press... you get the idea.
The details of the interventions can be documented later along with the narrative, but the bulk of the work was captured real-time. We can also sync with our monitors and show depth of compressions, rate of compressions and rhythms associated with the continuous chest compression style CPR we do for my agency.
Going back to paper for codes would be ludicrous for my department. The data would be shit for a start. Handwriting is often shit and made worse under the stress of screaming bystanders. And whether or not we achieved ROSC would change the likelihood of losing the paper in the shuffle.
The idea is to have the current system create a backup paper trail, and to practice resuming from it for when computers go down. Nothing about your current process needs to change, only that you be familiar with falling back to the paper backups when computers are down.
Which means that you have to be operating papered before the system goes down. If you aren't, the system never gets to transition because it just got CrowdStruck.
You can do CPR without a computer system, but changing systems in the middle of resuscitation where a delay of seconds can mean the difference between survival and death is absolutely not ideal. CPR in the hospital is a coordinated team response and if one person can’t do their job without a computer then the whole thing breaks down.
If you're so close to death that you're depending on a few seconds give or take, you're in God's hands. I would not blame or credit anyone or any system for the outcome, either way.
Judgement is always part of the process, but yeah running a routine code is pretty easy to train for. It's one of the easiest procedures in medicine. There are a small number of things that can go wrong that cause quick death, and for each a small number of ways to fix them. You can learn all that in a 150 hour EMT class.
Hello, I'm a journalist looking to reach people impacted by the outage and wondering if you could kindly connect with your ER colleague. My email is sarah.needleman@wsj.com. Thanks!
I mean, if they're finding sources through the comments and then corroborating their stories via actual interviews, it's completely fine practice. As long as what's printed is corroborated and cross-referenced I don't see a problem.
If they go and publish "According to hackernews user davycro ..." _then_ there's a problem.