by gregschlom 2997 days ago
I don't see anyone here commenting on this, but there's an interesting software / UI angle in this story: the captain preferred the fancier, animated-map-style weather reports from a 3rd-party company (B.V.S. reports) to the more terse text-only reports from the National Hurricane Center (sat-C reports).

Turns out, the B.V.S. reports were using raw data that was 10 to 12 hours old, and they didn't explicitly mention that. In the case of a rapidly evolving hurricane, it mattered a lot.

> The B.V.S. map included a time stamp that showed when the processing had been completed, but gave no indication of the age of the raw data on which the forecast was based. Davidson knew that all the forecasts were uncertain, and that they sometimes disagreed. But how aware was he that when he looked at the B.V.S. maps he was looking into the past?

[...]

> Davidson dismissed the plan with a thank-you and did not come to the bridge. Evidence suggests that he was still showing a preference for the animated B.V.S. graphics, which indicated the storm progressing more slowly.


The bit you skipped:

> He went down to his stateroom after his conversation with Schultz, and when he returned to the bridge he said, “All right, I just sent up the latest weather. Let us clear everything off the chart table with the exception of the charts.” Schultz opened the B.V.S. program. As it happened, according to the N.T.S.B. report, because of a software glitch, the map that appeared was the very same map that had come in with the previous download, six hours earlier. The raw data on which it was based was at least 12 hours old.

It sounds like it wouldn't actually have helped if the captain had had the hourly updates, because this unexplained software glitch meant the device wasn't showing the updated maps anyway.

The apparent moral of that part of the story is that the B.V.S. charts are past useless and into actively harmful territory, since they may be not just old but outright wrong.

Sounds like the mapping software should put a big red timestamp in the middle of the map if the current data is over 1 hour old.
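Something like this, as a rough sketch. The one-hour threshold comes from the comment above; the function name and the idea that the display is handed the raw data's time separately from the processing time are my own assumptions, not anything known about how B.V.S. works.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold, taken from the "over 1 hour old" suggestion above.
MAX_AGE = timedelta(hours=1)


def staleness_warning(data_time, now, max_age=MAX_AGE):
    """Return a warning string if the raw data is older than max_age, else None.

    data_time must be the time of the underlying observations,
    not the time the map finished processing.
    """
    age = now - data_time
    if age <= max_age:
        return None
    hours, remainder = divmod(int(age.total_seconds()), 3600)
    minutes = remainder // 60
    return f"WARNING: raw data is {hours}h {minutes:02d}m old"


# Hypothetical times, for illustration only.
now = datetime(2020, 1, 1, 6, 0, tzinfo=timezone.utc)
print(staleness_warning(now - timedelta(hours=12), now))
# WARNING: raw data is 12h 00m old
```

The point is that the age is computed from the observation time and shown as an age, so the reader doesn't have to do the subtraction in their head.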

It's unfortunate when programmers overlook something this big.

Yes, I saw the same thing. Even when I've had people reviewing logs, I make the date prominent and train them to check it first. BVS's data is far more critical.

(I learned from a burnt hand: We engaged a non-technical but reliable user to check the daily backup log for errors and report any to us. One day we needed the backup and discovered it hadn't run at all for many weeks. Ouch. I asked the user; she said she indeed checked the logs daily and they were fine. She was right: She was seeing the log from the last backup, unchanged every day. My fault entirely, not hers: We should have anticipated the date problem, and we should have utilized someone technically literate enough to understand what they were reading - in this case, someone who would recognize an 'obvious' problem such as the numbers of files and bytes not changing. And we should have tested our backups more often, but that old lesson almost isn't worth mentioning.)
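For what it's worth, the date check is cheap to automate. A minimal sketch, assuming the backup job rewrites its log file on every run so the file's modification time tracks the last backup; the function name and the 26-hour allowance for a daily job are hypothetical choices of mine.

```python
import os
from datetime import datetime, timedelta, timezone


def backup_log_is_fresh(log_path, now=None, max_age=timedelta(hours=26)):
    """Return True only if the log exists and was modified within max_age of now."""
    if now is None:
        now = datetime.now(timezone.utc)
    try:
        mtime = os.path.getmtime(log_path)
    except OSError:
        # A missing or unreadable log counts as a failure, not as "no errors".
        return False
    modified = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return now - modified <= max_age
```

A stale or missing log then fails loudly instead of looking like yesterday's clean run.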

Not that it will comfort you much, but you are hardly alone in this. Bad backups, logging on the same servers where access takes place, and single points of failure in personnel are some of the most frequently occurring things I come across in my 'day job'.
Been there, done that.

I recently helped a company big enough that everyone here would recognize the name fix a lot of items around monitoring and logging after finding they were running an important production system in such a manner as to be essentially flying blind. Yeah, those fatal errors in the logs just might be important...

Thanks, and I know it well; that was an early lesson. Backups in particular are an amazing cesspool of problems for something so conceptually simple.
That's the thing that always bugs me, the vast majority of the items that I end up with on the todo list after a review would cost $0 or very little to get right.

Super frustrating. And you can't rely on things staying fixed either: you have to review periodically or it will be back to square one within the year.

> the vast majority of the items that I end up with on the todo list after a review would cost $0 or very little to get right

Agreed, and I drive people crazy with my focus on those things. Thorough design and implementation (including testing) up front cost far less than correcting problems later, and they don't add the enormous cost of downtime and other failures.

But ... I've found that human beings, even serious professionals, have a capacity limit for details, and it's not very high; and if it's for an over-the-horizon risk, attention is very limited. That is my biggest constraint: editing down the details, organizing them, automating them, and making trade-offs to reduce them to a point where others don't throw up their hands. Also, it's hard to get the budget for that up-front investment in what looks to others like obsessiveness (it's not; it's carefully considered ROI).

So when you show up for your review (I don't know exactly what you do, but I have an impression), 1,000 details might have been addressed but 50 overlooked. Or 1,050 details might have been implemented but there was no capacity for the next 100: resources ran out, something else came up, etc.

So I can see it both ways.

Good stuff, thank you, I can see there might be some way to get a process in place to avoid these relapses.
Also note that B.V.S. did appear to have an 'Hourly Update' feature that did not appear to be subscribed to.

It's also important to realize that the captain was confident in the plan, except that the storm in question was abnormal. That's an important fact to consider, as it throws off the mariner's heuristics for deciding how much more attention to pay.

> Also note that B.V.S. did appear to have an 'Hourly Update' feature that did not appear to be subscribed to.

From the article, "The Coast Guard report also noted that “El Faro crew did not take advantage of B.V.S.’s tropical update feature,” which would have provided hourly updates."

So hourly updates apparently were available even if not subscribed to.

The crew did not take advantage, or the company did not buy the subscription?
The NTSB report says that it was the former: the hourly reports were available, but the captain had not configured the software settings to receive them.

(Obviously, and hindsight is 20-20, hourly should've been the default).

There’s an interesting ethics question here: If your ship’s location places you near a hurricane, should they just quietly upgrade you for free temporarily so that you don’t sink?
They should at least place a large warning stating that the data is not safe to use for navigational purposes.
I'm sort of surprised that data from a source like that is used. Have you ever seen the weather data that airline captains use? It looks like this:

KORD 050251Z 26006KT 10SM SCT110 M02/M09 A3019 RMK AO2 SLP232 T10171089 53009

That's the current weather conditions for Chicago O'Hare.

Looks like gibberish to you and me but airline captains know what it means and they know how old the observation is. Captain has ultimate authority on the ship, but also ultimate responsibility. It was his job to know what that weather report meant. And surprising to me if he didn't, since by other accounts he was experienced, organized, and safety-conscious.
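To make the "they know how old the observation is" part concrete: the `050251Z` group is day-of-month, hour and minute in UTC, so this one was taken on the 5th at 02:51Z. A small sketch of pulling that out and turning it into an age follows. The function name is hypothetical, and since a METAR carries no month or year, they are inferred from a reference time you pass in, which is an assumption of this sketch.

```python
import re
from datetime import datetime, timezone


def metar_observation_time(metar, now):
    """Return the observation time encoded in the DDHHMMZ group as a UTC datetime.

    The month and year are not in the report, so they are taken from `now`.
    If the day/time would fall after `now`, the report is assumed to be
    from the previous month.
    """
    match = re.search(r"\b(\d{2})(\d{2})(\d{2})Z\b", metar)
    if match is None:
        raise ValueError("no DDHHMMZ time group found")
    day, hour, minute = (int(group) for group in match.groups())

    year, month = now.year, now.month
    if (day, hour, minute) > (now.day, now.hour, now.minute):
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    # Raises ValueError if the day does not exist in the inferred month.
    return datetime(year, month, day, hour, minute, tzinfo=timezone.utc)


report = "KORD 050251Z 26006KT 10SM SCT110 M02/M09 A3019 RMK AO2 SLP232 T10171089 53009"
# Hypothetical "current" time, for illustration only.
now = datetime(2020, 1, 5, 3, 30, tzinfo=timezone.utc)
observed = metar_observation_time(report, now)
print(now - observed)  # 0:39:00
```

The rest decodes the same way: `26006KT` is wind from 260 degrees at 6 knots, `10SM` is 10 statute miles visibility, `SCT110` is scattered clouds at 11,000 feet, `M02/M09` is temperature -2 °C with dew point -9 °C, and `A3019` is an altimeter setting of 30.19 inHg.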

You can't get licensed as a pilot without being able to read METARs. They will not let you fly. Find me a comparable licensure requirement for marine captains.
Funnily enough I actually look at the KORD weather reports all the time and took the time to learn how to read pilot reports.

I think the difference is that captains are dealing with currents in the ocean and in the air. For the most part pilots are flying above things that will make the wind change in extremely short distances.

Interesting! Did some searching and found this[1] which outlines the parts of a reading.

[1]: https://www.wunderground.com/metarFAQ.asp

If you added these warnings, you would want to be on the safe side, so there would be a lot of false positives. As a result, your customers would get accustomed to them and stop taking them seriously.

There are no easy fixes here.

How about placing a clear date/time stamp on the chart?
Didn't Tesla do that for cars near evacuation zones?
Is that really a UI issue or misrepresented data?

If I see a timestamp on rapidly changing data I sure as hell would like to see if there's a delayed effective stamp (most stock market tickers indicate the "effective date" of their data or say clearly there's a XX min delay).

> Is that really a UI issue?

More of a UX issue. The worst possible kind.

Reminded me of this aircraft accident where the time delay of radar satellite datalink weather information was critical. The time delay there is minutes not hours.

https://www.youtube.com/watch?v=83uvKWJS2os

An aviation accident case study pinpointed a similar issue with in-cockpit weather displays. In this case the delay was less than 10 minutes, but it was enough to be fatal: https://youtu.be/83uvKWJS2os