This blog post describes exactly the scenario we were experiencing here: a master (single-writer) failure with no failover. You can only guess what went wrong with this plan. It looks good on paper, but some unexpected network, hardware, or routing problem could have caused a problem identifying the single writer.