by rodgerd 3941 days ago
> I've heard good things about Oracle's RAC, but it's understandably intolerant of your screwing up its disks (SAN mis/re-configuring) when you aren't properly maintaining backups

There are a number of problems with RAC. Some come from people using it wrong, and some are inherent to RAC. "Using it wrong" covers things like people not understanding that it sits on shared storage, so it provides compute node resilience, not storage resilience. They probably should spend on some Data Guard (or equivalent), unless they want to be the DBA equivalent of the server admin who thinks you don't need backups because you've got RAID.

The built-in problems come from the fact that Oracle ASM doesn't check[1] the signatures on disks/LUNs presented to it. So if the SAN admin, I don't know, manages to somehow reverse the mappings for one LUN of 30 between the stress RAC and the dev RAC, Oracle will not refuse to start and say "that ASM disk has the stress signature on it". Instead, Oracle will overwrite the stress LUN with dev data for a while, then go to read it, then discover it doesn't have the on-disk structure it expects, then crash with a SEGV or other entertaining but unhelpful error. But only after it's irretrievably corrupted the ASM group, of course.

[1] as of 10g, the last time I hit this problem.
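
To make the missing check concrete, here's a rough sketch of the kind of "verify the signature before you touch the disk" guard I mean. This is not Oracle's code and not ASM's real on-disk format: the header layout, the `DEMOHDR1` magic, and every function name below are made up for illustration. The only point is that the mount step refuses to proceed when a device carries another group's signature.

```python
import struct
from typing import Mapping

# Hypothetical header layout (NOT the real ASM disk header):
# an 8-byte magic followed by a 32-byte NUL-padded disk group name.
MAGIC = b"DEMOHDR1"
HEADER = struct.Struct("<8s32s")


class SignatureMismatch(Exception):
    """Raised when a device does not carry the expected group signature."""


def make_header(group_name: str) -> bytes:
    """Build a header for a device belonging to group_name."""
    return HEADER.pack(MAGIC, group_name.encode("ascii"))


def read_group_name(device_bytes: bytes) -> str:
    """Return the group name stored in the header at the start of a device."""
    if len(device_bytes) < HEADER.size:
        raise SignatureMismatch("device too small to hold a header")
    magic, raw_name = HEADER.unpack_from(device_bytes)
    if magic != MAGIC:
        raise SignatureMismatch("no recognisable header on device")
    return raw_name.rstrip(b"\x00").decode("ascii")


def verify_before_mount(devices: Mapping[str, bytes], expected_group: str) -> None:
    """Refuse to mount if any presented device belongs to another group."""
    for path, data in devices.items():
        found = read_group_name(data)
        if found != expected_group:
            raise SignatureMismatch(
                f"{path} carries signature {found!r}, expected {expected_group!r}"
            )
```

With a guard like this, the swapped LUN turns into a refusal at startup (`verify_before_mount` raises and names the offending device) instead of a slow overwrite followed by a crash.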