With that much gear and those kinds of loads, do you still have a traditional UPS / transfer switch / genset arrangement for everything in the room? If not, how do you manage short-duration power outages?
Yep, we have battery-backed generators for UPS and a transfer switch at the 480-V feed that comes into the room, but that is not enough to power the compute nodes. The UPS allows the cluster management nodes and the parallel filesystem (which is a small cluster by itself) to ride through full outages and other power quality events (PQE).