Hacker News
by a_shovel 1328 days ago
I'm still of the opinion that handling leap seconds by ignoring them is a dumb idea.

Unix time should be a steady heartbeat counting up the number of seconds since midnight, January 1 1970. Nice, clean, simple. How you might convert this number into a human-readable date and time is out of scope/implementation-defined/an exercise for the reader/whichever variation of "not my problem" you prefer.

3 comments

> Unix time should be a steady heartbeat counting up the number of seconds since ...

This is nice and clean, so long as you have exactly one computer. The second there are more than one, and they are talking to each other, their clocks can go out of sync with each other. And they will, because they are physical systems that are imperfect and in general much less precise than you'd expect them to be.

This means that there has to be a way to correct for errors. The best method, which almost everyone who manages a lot of computers converges on, is to "smear out" any errors: never change the time on a machine in a discrete step, but shorten or lengthen seconds slightly to bring any outliers back to the correct values. And once you have this system, using it to deal with a leap second is the easiest, simplest and least error-prone method.

I do think that there are purposes where local "machine time", which is just a monotonic clock counting upwards from bootup, would make sense. Especially when subsecond accuracy is important. But it should always be clear that there is no way to reliably convert between that and wallclock or calendar time. There are *no* intervals of calendar/wallclock time that reliably convert to any interval of machine time. It is not guaranteed that any wallclock minute contains exactly 60 machine seconds.
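To make the distinction concrete, here is a minimal Python sketch that times an interval with the monotonic clock and reads the wall clock separately. The helper name `measure` is made up for illustration; `time.monotonic()` and `time.time()` are the standard library calls.

```python
import time

def measure(work):
    """Return how long work() took, in seconds, using the monotonic clock."""
    start = time.monotonic()   # arbitrary origin, never steps backwards
    work()
    return time.monotonic() - start

elapsed = measure(lambda: time.sleep(0.05))

# Wall-clock time is a different thing: it can be stepped or slewed
# by NTP or an administrator while the program is running.
wall_now = time.time()

print(f"elapsed {elapsed:.3f}s on the monotonic clock; wall clock reads {wall_now:.0f}")
```

The monotonic reading is only meaningful as a difference between two readings on the same machine, which is the point made above: it says nothing reliable about calendar time.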

I think people forget how often computers in the 80’s and 90’s were turned off, keeping track of time with a backup battery, and just how many machines had a dead backup battery and users that didn’t know any better.

I fielded tech support questions from an app that had grown to send email. Every week we got a couple of emails from some time on January 1, 1970, each from a new person, and a whole slew of people whose batteries were on the edge of failing and so their machine was days or months off from reality.

The HTTP 1.0 spec already had a solution for two machines with different ideas of the current date. It’s one of my favorite parts of the spec and I’ve used it a few times in order to avoid having to implement my own time negotiation protocols (or in fact to stop others from doing it).

I don’t think that battery chemistry has changed all that dramatically since that time. It was still a 2032 cell, for machines that have a discrete battery. Instead it’s the clock chip and network time protocols that have gotten more efficient, and we use the machines more consistently. Or at least the machines where time counts matter the most are on all the time.

> The HTTP 1.0 spec already had a solution for two machines with different ideas of the current date.

I can't recall noticing this, and can't seem to find anything about it in[1] - could you elaborate?

[1] https://www.w3.org/Protocols/HTTP/1.0/spec.html#Date

You’ve found it, or very nearly. The very next section is Expires:

> The format is an absolute date and time as defined by HTTP-date in Section 3.3.

The implication is subtle but critical. When the server sends a Date header and an Expires header, you don't expire the content when the local time exceeds the Expires header. You expire it at

   LocalTime + (Expires - Date)

That covers not only time zones but also clock drift. When the client is sending data such as a POST, it also sends a Date header. That can account for time zones, clock drift, and to an extent network latency. When you’re legally bound to establish the order of events in a distributed system someone has to be the source of truth, and even when you’re not it’s still good to have for your own purposes. The system of record is the only thing that is running on the same clock as the system of record, so it is the most sensible source of truth.
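The expiry rule can be sketched in a few lines of Python. This is an illustration, not something taken from the spec: the function name `local_expiry` and the sample header values are made up, and it assumes both headers parse as HTTP dates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def local_expiry(date_header: str, expires_header: str, received_at: datetime) -> datetime:
    """Expire at LocalTime + (Expires - Date), ignoring the server's absolute clock."""
    server_date = parsedate_to_datetime(date_header)
    server_expires = parsedate_to_datetime(expires_header)
    freshness = server_expires - server_date   # a duration, so clock skew cancels out
    return received_at + freshness

# The server says the response is good for one hour by its own clock.
# The client's clock is almost two days off; it does not matter.
expiry = local_expiry(
    "Tue, 15 Nov 1994 08:12:31 GMT",
    "Tue, 15 Nov 1994 09:12:31 GMT",
    received_at=datetime(1994, 11, 17, 3, 0, 0, tzinfo=timezone.utc),
)
print(expiry.isoformat())  # 1994-11-17T04:00:00+00:00
```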

So when a client sends you a buffer full of dated events, you can (and should) consider the timestamps in the POST body as relative to the Date header, not your local system time. Otherwise someone running on brown power or old school power saving mode will screw up all of the timelines in your data.
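A rough sketch of that rebasing, in Python. The function name `rebase_events` and the sample data are hypothetical, and it assumes the event timestamps were taken from the same clock that produced the client's Date header. Network latency is ignored.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def rebase_events(date_header, event_times, received_at):
    """Shift client-side timestamps onto the receiver's clock.

    date_header : the Date header the client sent with the POST
    event_times : timezone-aware datetimes from the POST body (client clock)
    received_at : when the server received the request (server clock)
    """
    client_now = parsedate_to_datetime(date_header)
    offset = received_at - client_now          # how far off the client's clock is
    return [t + offset for t in event_times]

# A client with a dead clock battery believes it is 1970.
client_events = [
    datetime(1970, 1, 1, 0, 3, 0, tzinfo=timezone.utc),
    datetime(1970, 1, 1, 0, 8, 0, tzinfo=timezone.utc),
]
rebased = rebase_events(
    "Thu, 01 Jan 1970 00:10:00 GMT",
    client_events,
    received_at=datetime(2021, 3, 1, 12, 0, 0, tzinfo=timezone.utc),
)
for t in rebased:
    print(t.isoformat())
```

The events keep their spacing and their distance from "now" as the client saw it; only the absolute position on the timeline comes from the server.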

> You expire it at LocalTime + (Expires - Date)

Which makes for a wonky mess, and I guess is why Cache-Control did away with the entire thing and just tells you how long the response is fresh.

Not as wonky as you'd think, and 60 minutes means something much different than 12:15. Which can be good and it can be bad, but many of my experiences with TTLs have not been pleasant. 60 minutes, as typically implemented, means some user out there has data that is 1:58 stale and he's yelling at your boss on the phone who is now trying to figure out if he should be passing the favor along.

See also ETags, which stop trying to be clever about dates and are instead clever about the contents of the message.

Why I like the Date header is that it works well with REST endpoints that care about time but for whom caching is either not a good idea or is orthogonal.

There was a Cray installation in Japan in the late 80s or early 90s which was reportedly turned off at night to save electric costs.

Applications that require clock synchronization across multiple machines should be maintaining their own clock anyway, independent of the system clock: NTP for millisecond-order precision, atomic clock time cards for microsecond-order precision.

What is the actual benefit of this?

The cost is that conversion to/from civil time is far more complicated, and worse, cannot be computed for future dates for which leap seconds have not yet been determined.

I think that 86,400-second days with a 24-hour leap smear hits a sweet spot of utility and usability: https://developers.google.com/time/smear

There are very few applications that will know or care that seconds get 0.001% longer for 24 hours.
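For a feel of the numbers, here is a small Python sketch of a linear 24-hour smear for one positive leap second. It assumes the window covers 86,401 SI seconds while the smeared clock counts 86,400; the constant and function names are made up for illustration and this is not Google's implementation.

```python
SMEAR_SI_SECONDS = 86_401      # real (SI) seconds that pass during the window
SMEAR_CLOCK_SECONDS = 86_400   # seconds the smeared clock counts in that window

def smeared_elapsed(si_elapsed: float) -> float:
    """Seconds shown by the smeared clock after si_elapsed SI seconds of the window."""
    if not 0 <= si_elapsed <= SMEAR_SI_SECONDS:
        raise ValueError("outside the smear window")
    return si_elapsed * SMEAR_CLOCK_SECONDS / SMEAR_SI_SECONDS

# How much longer each smeared second is, as a percentage.
stretch_percent = (SMEAR_SI_SECONDS / SMEAR_CLOCK_SECONDS - 1) * 100
print(f"each second is about {stretch_percent:.4f}% longer")

# Halfway through, the smeared clock lags an unsmeared one by half a second.
halfway = SMEAR_SI_SECONDS / 2
print(f"lag at the midpoint: {halfway - smeared_elapsed(halfway):.3f}s")
```

By the end of the window the full extra second has been absorbed and no clock ever showed a 61-second minute.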

Skipping or smearing leap seconds means conversion to/from civil time is simple ... as long as you only want an accuracy of +-1 second.

If you need better conversion, you have to know how the system chose to handle leap seconds. If it smears, you have to know and implement the smear algorithm; if it doesn't, then some unix timestamps will map to two different points in civil time (second 59 and 60 in any hour with a leap second).
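The collision is easy to show with plain arithmetic. The Python sketch below treats every day as exactly 86,400 seconds, which is how unix time is defined; the helper name `posix_timestamp` is made up. With this formula the leap second 23:59:60 lands on the same number as the following midnight. Real systems differ in which neighbouring second they end up repeating.

```python
from datetime import date

EPOCH_ORDINAL = date(1970, 1, 1).toordinal()

def posix_timestamp(year, month, day, hour, minute, second):
    """Seconds since the epoch, pretending every day has exactly 86,400 seconds."""
    days = date(year, month, day).toordinal() - EPOCH_ORDINAL
    return days * 86_400 + hour * 3_600 + minute * 60 + second

# The leap second at the end of 2016, and the first second of 2017.
leap = posix_timestamp(2016, 12, 31, 23, 59, 60)
after = posix_timestamp(2017, 1, 1, 0, 0, 0)

print(leap, after, leap == after)  # 1483228800 1483228800 True
```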

For 99.9% of applications, it's not a bad tradeoff. But if we're willing to fudge time by up to a second, recording time in TAI (just count a second every second) and living with future dates sometimes being off by a couple seconds when converted to UTC isn't that different in terms of utility/usability tradeoff, and conceptually simpler.
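As a sketch of what "record TAI, convert when needed" could look like, here is a small Python example with a hand-written table of TAI minus UTC offsets. Only the three most recent entries are listed, and the names `TAI_MINUS_UTC` and `tai_to_utc` are made up for illustration. A real system would load the full, current table from an authoritative source.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# (UTC instant from which the offset applies, TAI - UTC in seconds)
TAI_MINUS_UTC = [
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
    (datetime(2017, 1, 1), 37),
]

def tai_to_utc(tai: datetime) -> datetime:
    """Convert a naive datetime on the TAI scale to a naive UTC datetime.

    Simplification: datetime cannot represent 23:59:60, so the leap second
    itself folds onto the second that follows it.
    """
    # The instant each offset starts to apply, expressed on the TAI scale.
    starts = [utc_start + timedelta(seconds=off) for utc_start, off in TAI_MINUS_UTC]
    i = bisect_right(starts, tai) - 1
    if i < 0:
        raise ValueError("before the first entry in the table")
    return tai - timedelta(seconds=TAI_MINUS_UTC[i][1])

print(tai_to_utc(datetime(2020, 1, 1, 0, 0, 37)))  # 2020-01-01 00:00:00

# A date past the end of the table uses the last known offset. If leap
# seconds are announced in between, the result will be off by that many seconds.
print(tai_to_utc(datetime(2040, 1, 1, 0, 0, 37)))
```

The second call is the trade-off described above: the stored TAI value stays exact, and only the rendering into UTC for far-future dates is provisional.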

If you need better precision, you don't go converting to/from civil time, because people think on quanta of ~1 second and better precision is not defined there.

Instead, I'll put my rule here: It's perfectly reasonable to drop your time precision to ~1 second when converting it into "people time", and you should never expect better precision from a back conversion.

That's not saying you don't need TAI. It's just not reasonable to mix it and UTC et al. If you need TAI, you live in TAI.

How many times a second do computers communicate notions of time between each other, versus to a human? We’re talking many orders of magnitude here, and inequalities like that always change the winning strategy.

It sounds like you're looking for TAI: https://en.wikipedia.org/wiki/International_Atomic_Time