by nickpsecurity 3779 days ago
Nice writeup. I disagree that we need chip-scale, atomic clocks. My idea was a dedicated, battery-backed piece of hardware that reliably stored time plus could sync other machines. Plugs into an interconnect with ultra-low latency. One for each datacenter.

You can plug one into each machine in the cluster periodically to sync it. Or you can plug it into a master node that connects to the others over a low-latency management interface separate from the main data line. Occasionally, the time server gets exclusive access to that line, assesses latency, and then syncs the machines to its time. The time server might be custom-built to avoid its own skew, or it could keep one of the timekeeping devices attached. The devices are periodically shipped to a central location to resync against an atomic clock or each other.
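
A rough sketch of the "assess latency, then sync" step, using Cristian's algorithm (a standard technique that I'm swapping in here, not something specified in the comment above). The names `estimate_offset`, `best_offset`, and `SimulatedLink` are hypothetical, and the sketch assumes the request and reply take about the same time on the dedicated line, which is what exclusive access to that line is meant to help with.

```python
from typing import Callable, Tuple


def estimate_offset(local_clock: Callable[[], float],
                    query_reference: Callable[[], float]) -> Tuple[float, float]:
    """Return (offset, round_trip) in seconds for one exchange.

    offset is how much to add to the local clock to match the reference.
    The estimate is only good to about +/- round_trip / 2.
    """
    t0 = local_clock()                  # local time when the request leaves
    reference_time = query_reference()  # timestamp taken by the reference
    t1 = local_clock()                  # local time when the reply arrives
    round_trip = t1 - t0
    # Assume the reply spent half the round trip in flight.
    offset = reference_time + round_trip / 2 - t1
    return offset, round_trip


def best_offset(local_clock: Callable[[], float],
                query_reference: Callable[[], float],
                samples: int = 5) -> Tuple[float, float]:
    """Take several exchanges and keep the one with the shortest round trip."""
    results = [estimate_offset(local_clock, query_reference)
               for _ in range(samples)]
    return min(results, key=lambda r: r[1])


class SimulatedLink:
    """Hypothetical stand-in for the dedicated line, for illustration only."""

    def __init__(self, local_skew_s: float, one_way_delay_s: float) -> None:
        self.true_time = 0.0
        self.local_skew_s = local_skew_s
        self.one_way_delay_s = one_way_delay_s

    def local_clock(self) -> float:
        # The server's clock is wrong by a fixed skew.
        return self.true_time + self.local_skew_s

    def query_reference(self) -> float:
        self.true_time += self.one_way_delay_s   # request in flight
        stamp = self.true_time                   # reference reads true time
        self.true_time += self.one_way_delay_s   # reply in flight
        return stamp


link = SimulatedLink(local_skew_s=0.25, one_way_delay_s=0.001)
offset, round_trip = best_offset(link.local_clock, link.query_reference)
print(f"offset={offset:.6f}s round_trip={round_trip:.6f}s")
```

The point of the low-latency interconnect shows up in the error bound: the shorter and more symmetric the round trip, the tighter the offset estimate.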

What do y'all think?

2 comments

> My idea was a dedicated, battery-backed piece of hardware that reliably stored time plus could sync other machines. Plugs into an interconnect with ultra-low latency. One for each datacenter.

Google have a variant on this, where they use a GPS receiver in each data centre to provide an accurate time source for local machines.

I've seen this being used extensively in sensor control centers (power grid/power plants), so it's definitely not limited to Google.

Basically by accurate and precise timing, they are able to reconstruct and pinpoint the origin of a failure.

No, they use a GPS with 7ms time spread for servers as you said. What I'm describing is a custom device set against an atomic clock to nanosecond or microsecond accuracy, which then does the same for the servers via low-latency interconnects. Optionally with a time server and dedicated networks for reduced admin overhead.

That should do a lot better than 7ms, with the performance implications that follow.

It's actually a mix of GPS and atomic clocks for diversity. See the Spanner paper section on TrueTime for details.

That's already a thing. Google "GPS-based NTP servers" with a TCXO or rubidium reference. You can buy one for ~$2000 that will use GPS or cellular as a time reference, and you can add options like a temperature-compensated crystal oscillator (TCXO) or a rubidium reference (basically, a small atomic reference).

Interesting. So, are you saying it's a start on my idea or what I'm proposing? As in, can it currently sync the servers in multiple datacenters to the point they could operate with microsecond spreads?