by etm117
5747 days ago
The way I read it, the problem was corruption inside the database, and the warm backup was corrupted by the automatic mirroring before anyone noticed. So by the time the issue was identified, both the PROD and failover instances were busted. To resolve it, it looks like they had to roll back to the last valid full DB backup from Sunday and then apply the log backups one at a time from Sunday onward to catch the DB up before bringing it back online.

At my shop we had a similar issue (but at the SAN level, not the DB level) where the trigger was data that exposed a bug in the system. The data was automatically mirrored to the warm standby machine. When PROD crashed, the standby was brought up and immediately crashed too. We had to rebuild from tape backups, which was stupid-slow (trademarked term there ;-). All in all it was a horrible mess that was root-caused to a bug in vendor firmware. Eerily similar to the JPMorgan Chase issue in the OP.
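For anyone who hasn't had to do one of these restores, here is a toy sketch in Python of the "last good full backup, then replay the logs in order" idea. It is only an illustration, not how Oracle or any real database does it: the `LogBackup` type, the sequence-number fields, and the `restore` function are all made up for the example. The one real-world point it tries to show is that the logs have to be applied in order with no gaps, which is why the catch-up is slow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogBackup:
    """Hypothetical log backup covering sequence numbers first_seq..last_seq."""
    first_seq: int
    last_seq: int
    changes: dict  # key -> new value recorded in this log segment


def restore(full_backup: dict, backup_seq: int, logs: list) -> tuple:
    """Start from the full backup, then replay log backups in sequence order.

    Returns the recovered data and the last sequence number applied.
    Raises ValueError if a log segment is missing from the chain.
    """
    db = dict(full_backup)  # never mutate the backup itself
    seq = backup_seq
    for log in sorted(logs, key=lambda entry: entry.first_seq):
        if log.last_seq <= seq:
            continue  # already covered by the full backup
        if log.first_seq > seq + 1:
            raise ValueError(f"log chain broken after sequence {seq}")
        db.update(log.changes)
        seq = log.last_seq
    return db, seq


if __name__ == "__main__":
    sunday_full = {"acct:1": 100}
    logs = [
        LogBackup(21, 30, {"acct:2": 50}),
        LogBackup(11, 20, {"acct:1": 80}),
    ]
    print(restore(sunday_full, 10, logs))
```

Note that replaying the logs would also replay whatever bad write caused the corruption if you ran them all the way to the end, so in practice you stop just short of it.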
|
I'll also hazard a guess that they did it this way because some "enterprise architects" designed the system; no Oracle DBA would have done it like that, for precisely those reasons.
NoSQL absolutely would not help in this case. If you are trading on the web you need the clickstream for the regulators, just like a bank tapes every phone conversation.