Hacker News new | ask | show | jobs
by erichocean 1404 days ago
Thank you for publishing the TLA+ model, that dramatically increases my level of trust in Neon. (I'm on the early adopter list, got my invite a week or two ago but haven't been able to give it a spin yet.)

> Right now, such a change requires humans to be in the loop to ensure that the old safekeeper is actually down. It is on our roadmap to automate this procedure.

If you do implement this (which I don't recommend), be certain to also model it with TLA+. This level of automation, IMO, requires a human in the loop + a ton of visibility tracking on when it is happening.

A good way to roll it out is "semi-automation"—implement the automation but use it to ask a human to approve. The human will then do the normal (manual) verification. After you've run that successfully for a year, and your TLA+ model passes, you can then decide to fully automate without a human in the loop.

Otherwise, you're asking for an outage (caused by bad failover) IMO, and possibly data loss.