Hacker News
by jasonwatkinspdx 5749 days ago
You completely failed to read the article.

Otherwise you'd know that they had a fault that propagated to the hot spare. It's also utterly daft to think that a financial enterprise as large as JPM/Chase wouldn't already be running a HA setup. In this case it appears to be Oracle RAC.

I'm astounded how often I have to remind people that replication and backups are very different things, and that you need both.

I'm also depressed how many utterly thoughtless comments are made here on hackernews lately.


No, Oracle RAC shares the same storage between two or more nodes.

What they had here would appear to be database A running on storage A, which is replicated at the storage level to storage B, where database B waits in an idle state. Because the replication system is "blind" - it only sees its own filesystem containing bytes, not Oracle data structures - it can't tell a good Oracle block from a bad one, so it copies corrupted blocks just as faithfully as good ones.
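To make the "blind" part concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the block format (payload plus a CRC32 trailer), the function names, and the tiny block size. It is not how Oracle or any storage array actually lays out or checks blocks; it only shows why a byte-level copy carries corruption across while a copy that understands the block format can refuse it.

```python
import zlib

def make_block(payload):
    """Build a toy block: payload followed by a 4-byte CRC32 trailer.
    (Hypothetical format, not Oracle's real block layout.)"""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def block_is_valid(block):
    """Check the trailer against a freshly computed CRC32 of the payload."""
    payload, trailer = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer

def blind_replicate(primary):
    """Storage-level copy: bytes in, bytes out, no idea what a block is."""
    return [bytes(block) for block in primary]

def aware_replicate(primary, standby):
    """Format-aware copy: accept only blocks that pass the checksum and
    keep the standby's previous copy of anything that fails."""
    result, rejected = [], []
    for i, block in enumerate(primary):
        if block_is_valid(block):
            result.append(block)
        else:
            result.append(standby[i])
            rejected.append(i)
    return result, rejected

# Three healthy blocks, already mirrored to the standby.
primary = [make_block(f"row-{i}".encode()) for i in range(3)]
standby = list(primary)

# A fault corrupts one byte of block 1 on the primary.
primary[1] = b"X" + primary[1][1:]

blind_copy = blind_replicate(primary)
aware_copy, rejected = aware_replicate(primary, standby)

print([block_is_valid(b) for b in blind_copy])
print([block_is_valid(b) for b in aware_copy], rejected)
```

The blind copy ends up holding the same bad block as the primary, which is the situation described above. The aware copy keeps its last good version and reports which block it refused.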

I do this sort of setup for a living and you would be amazed at how many "architects" there are around who have completely drunk the storage vendor kool-aid and don't really understand how anything works (not even storage...).

This is likely the case, since the post mentioned that the storage controller was initially blamed (but cleared).
I rarely work with Oracle, so I'm not very familiar with the product line. Thanks for the correction.
I did read the article that was referenced. I did not read the article that that article referenced. My point was about the comment on "over engineering". This problem was not caused by over engineering.
It says right in the article that was referenced:

Before long, JPMorgan Chase DBAs realized that the Oracle database was corrupted in about 4 files, and the corruption was mirrored on the hot backup.