by gaius
5748 days ago
I'm guessing they were using the storage to do the replication, rather than using DataGuard to replicate and RMAN to make the initial copy. RMAN checksums the blocks on the way, so it will tell you off the bat if you have any block-level corruption. There's no way for the storage to do this, because it can't tell a valid Oracle block from any other sort of block. Because DataGuard is Oracle-aware, you always have a valid standby: if the primary datafiles are corrupt, you can still ship the redo logs (which you will be multiplexing too). I'll also hazard that they did it this way because some "enterprise architects" designed the system; no Oracle DBA would have done it like that, for precisely those reasons.
NoSQL absolutely would not help in this case. If you are trading on the web you need the clickstream for the regulators, just as a bank tapes every phone conversation.
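To illustrate the checksum point made earlier in this comment, here is a rough Python sketch of why a format-aware copy can spot corruption that a byte-for-byte storage copy cannot. The block layout (a 4-byte CRC32 header followed by the body), the block size, and every function name are assumptions made up for the example; this is not Oracle's block format and not how RMAN is implemented.

```python
import zlib

BLOCK_SIZE = 8192  # hypothetical block size, chosen only for the example


def make_block(payload: bytes) -> bytes:
    """Build a block: a 4-byte CRC32 of the body, then the zero-padded body."""
    body = payload.ljust(BLOCK_SIZE - 4, b"\x00")
    if len(body) != BLOCK_SIZE - 4:
        raise ValueError("payload too large for one block")
    return zlib.crc32(body).to_bytes(4, "big") + body


def block_is_valid(block: bytes) -> bool:
    """Recompute the checksum and compare it with the one stored in the header."""
    if len(block) != BLOCK_SIZE:
        return False
    stored = int.from_bytes(block[:4], "big")
    return stored == zlib.crc32(block[4:])


def storage_level_copy(blocks):
    """Copy bytes blindly; the copier has no idea what a valid block looks like."""
    return list(blocks)


def checked_copy(blocks):
    """Copy only blocks whose checksum matches; report indices of corrupt ones."""
    copied, corrupt = [], []
    for index, block in enumerate(blocks):
        if block_is_valid(block):
            copied.append(block)
        else:
            corrupt.append(index)
    return copied, corrupt


def flip_byte(block: bytes, offset: int) -> bytes:
    """Simulate on-disk corruption by inverting one byte of the block."""
    damaged = bytearray(block)
    damaged[offset] ^= 0xFF
    return bytes(damaged)
```

With three blocks where the middle one has been damaged by `flip_byte`, `storage_level_copy` faithfully replicates the damage, while `checked_copy` returns the two good blocks and flags index 1.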
Large modern websites store tons of information about a user that may not be necessary to keep for anything other than data mining, preferences, click/hit tracking, etc. I can't see how such data matters for finances or trades, so I don't understand why it couldn't be kept on a much less resource-intensive database solution.
Moreover, the cascading effect of a database failure is made much worse by putting all your eggs in one basket and depending on this one cluster of databases to keep the whole ship afloat. In a good design, much of the site should keep operating even if the backend databases are timing out under load. For example, your cache layer should continue serving cached content, logins, etc. for as long as the entries have not expired. This may not help clients who sign in at random times during the day, but people who use the site frequently or stay logged in throughout the day should keep their sessions in this scenario.
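A minimal Python sketch of that cache behaviour, under stated assumptions: `StaleOnErrorCache`, `BackendTimeout`, and the `fetch` callback are hypothetical names invented for the example, not part of any real library. Unexpired entries are served without touching the backend at all. Serving an expired entry when the backend times out goes a step beyond what is described above and is included only as one possible design choice.

```python
import time
from typing import Any, Callable, Dict, Tuple


class BackendTimeout(Exception):
    """Hypothetical error raised by the fetch callback when the database times out."""


class StaleOnErrorCache:
    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # loads a value from the backend database
        self._ttl = ttl_seconds      # how long an entry counts as fresh
        self._clock = clock          # injectable so the cache can be tested
        self._store: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._store.get(key)

        # Fresh entry: answer from the cache, the backend is never contacted.
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]

        try:
            value = self._fetch(key)
        except BackendTimeout:
            # Backend is down. Fall back to the expired copy if one exists
            # (an assumption of this sketch); otherwise nothing can be served.
            if entry is not None:
                return entry[0]
            raise

        self._store[key] = (value, now)
        return value
```

Users who were already cached keep getting answers during an outage, while a user the cache has never seen still hits the failing backend, which matches the caveat about clients who sign in only occasionally.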
The user-profile content that doesn't require ACID compliance could also use caching and NoSQL/MySQL/etc., which would keep the apps working even longer if one particular piece of technology has an outage. Because such technology doesn't carry some of the more complicated requirements of Oracle RAC, it may also be easier to recover or restore old data, again assuming the data has no particular need for ACID.