by justin_ 767 days ago
> UNIX time counts the number of seconds since an ``epoch.'' This is very convenient for programs that work with time intervals: the difference between two UNIX time values is a real-time difference measured in seconds, within the accuracy of the local clock. Thousands of programmers rely on this fact.

Contrary to this, since at least 2008[0], the POSIX standard (which is only a paper specification, not necessarily how real systems behaved at the time) has said that "every day shall be accounted for by exactly 86400 seconds." That means that on modern systems using NTP, your Unix timestamps will be off from the actual number of elapsed TAI seconds. And yes, it means that a Unix timestamp _can repeat_ on a leap second day.
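
To make the repeat concrete, here is a small sketch that evaluates the "Seconds Since the Epoch" expression from the POSIX definition for the leap second at the end of 2016 and for the following midnight. The function name `posix_seconds` is made up for illustration; the arguments follow the `struct tm` conventions (`tm_year` is years since 1900, `tm_yday` is zero-based).

```python
def posix_seconds(tm_year, tm_yday, tm_hour, tm_min, tm_sec):
    """POSIX 'Seconds Since the Epoch' expression, using integer division."""
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)

# 2016-12-31 23:59:60 UTC (the inserted leap second; day 365 of a leap year)
leap_second = posix_seconds(116, 365, 23, 59, 60)

# 2017-01-01 00:00:00 UTC
midnight = posix_seconds(117, 0, 0, 0, 0)

# Two different physical seconds map to the same integer.
print(leap_second, midnight, leap_second == midnight)
```

Nothing in the formula knows about leap seconds, so second 60 simply spills over into the next day's second 0.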

There's really no perfect way of doing things though. Should Unix time - an integer - represent the number of physical seconds since some epoch moment, or a packed encoding of a "date time" that can be quickly mapped to a calendar day? "The answer is obvious" say both sides simultaneously :^)

EDIT: I know DJB is calling out POSIX's choices in this article, but it seems like his "definition" does diverge from what the count actually meant to a lot of people.

[0] Also: "The relationship between the actual time of day and the current value for seconds since the Epoch is unspecified." https://pubs.opengroup.org/onlinepubs/9699919799.2008edition...


> There's really no perfect way of doing things though. Should Unix time - an integer - represent the number of physical seconds since some epoch moment, or a packed encoding of a "date time" that can be quickly mapped to a calendar day? "The answer is obvious" say both sides simultaneously :^)

Either way, POSIX time is arguably the worst of both worlds, since it is neither of those. This is one of those cases where a middle-ground compromise is worse than either option. Making datetime a 15:17-bit structure (date:time_of_day), or more practically just an int:int struct, would be infinitely better than current POSIX time. A plain counter of SI seconds since the epoch would also be better than POSIX time. Or even have both options available. There are so many better options that it is almost baffling how POSIX ended up with the worst possible one.
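
For what the 15:17 packing might look like, here is a rough sketch. The layout (day number in the high 15 bits, second of day in the low 17 bits) and the names `pack` and `unpack` are assumptions for illustration, not an existing format. Seventeen bits hold values up to 131071, so second-of-day 86400 (a leap second) fits without colliding with the next day.

```python
DATE_BITS = 15
TIME_BITS = 17
TIME_MASK = (1 << TIME_BITS) - 1

def pack(day, second_of_day):
    """Pack (days since epoch, second of day) into one 32-bit integer."""
    if not 0 <= day < (1 << DATE_BITS):
        raise ValueError("day does not fit in 15 bits")
    # 86400 is allowed so that 23:59:60 has its own encoding.
    if not 0 <= second_of_day <= 86400:
        raise ValueError("second of day out of range")
    return (day << TIME_BITS) | second_of_day

def unpack(packed):
    """Inverse of pack: returns (day, second_of_day)."""
    return packed >> TIME_BITS, packed & TIME_MASK

# Day 17166 is 2016-12-31; its leap second and the next midnight stay distinct.
leap_second = pack(17166, 86400)
midnight = pack(17167, 0)
print(leap_second != midnight, unpack(leap_second))
```

Fifteen bits of days only cover about 89 years, which is presumably why the int:int variant is the more practical one.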

I've spent literally hundreds of hours thinking about this, and I've landed on the opinion that TAI should always be the source of truth. It's usually not, because most systems ultimately get their source of truth from the GPS system, which uses something that looks kind-of like UTC (but is actually more similar to TAI, but people prefer to dress it up as UTC). This was a reasonable mistake that has caused so many problems due to the leap seconds.

If you're using TAI, it's more obvious that you have to account for leap seconds in order to convert to a human-readable civil time format (year, month, day, hour, minute, second). You know that you have to incorporate the official leap second list from IANA or some other authority. UTC makes it seem simpler than it actually is, and that's why there are so many problems.

You can always convert from TAI to civil time. You cannot always go the other way.
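
A minimal sketch of the TAI-to-civil direction, assuming a TAI second count whose zero is 1970-01-01 00:00:00 TAI (a convention picked for this example). The table is abbreviated to the three most recent leap seconds; a real program would load the full list from an authoritative source. `tai_to_civil` and `_LEAPS` are hypothetical names.

```python
import calendar
import time

# (UTC date on which a new TAI-UTC offset takes effect, offset in seconds).
# Abbreviated on purpose; earlier entries are omitted.
_LEAPS = [
    ((2012, 7, 1), 35),
    ((2015, 7, 1), 36),
    ((2017, 1, 1), 37),
]

def tai_to_civil(tai):
    """Return (year, month, day, hour, minute, second) in UTC for a TAI count."""
    for (y, m, d), offset in reversed(_LEAPS):
        start = calendar.timegm((y, m, d, 0, 0, 0))  # POSIX value of that midnight
        if tai >= start + offset:
            # Ordinary second: remove the offset, then do plain calendar math.
            return tuple(time.gmtime(tai - offset)[:6])
        if tai == start + offset - 1:
            # The inserted leap second itself: 23:59:60 on the previous day.
            return tuple(time.gmtime(start - 1)[:5]) + (60,)
    raise ValueError("instant is before the first entry of the abbreviated table")

for t in (1483228835, 1483228836, 1483228837):
    print(t, tai_to_civil(t))
```

Three consecutive TAI values come out as 23:59:59, 23:59:60 and 00:00:00, which is exactly the information a POSIX timestamp cannot carry.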

My opinion is to use a dual structure, which is what I had considered for my operating system design: a signed 64-bit (or possibly longer) number of seconds, and a 32-bit number of nanoseconds (which is not always present; sometimes nanosecond precision is unknown or unnecessary, so it would be omitted). There are then separate subtypes of timestamps, such as UTC (which allows the number of nanoseconds to exceed one billion) and SI (which does not allow the number of nanoseconds to exceed one billion). Which is appropriate depends on the application.
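
A rough sketch of what such a pair could look like. All names here are invented, and the upper bound of two billion nanoseconds for the UTC subtype is an assumption (enough to express one inserted leap second as 23:59:59 plus 1.0 to 2.0 seconds); the comment above only says the count may exceed one billion.

```python
from dataclasses import dataclass
from typing import Optional

NS_PER_SEC = 1_000_000_000

@dataclass(frozen=True)
class Timestamp:
    seconds: int                 # signed 64-bit count of seconds
    nanos: Optional[int] = None  # None means the precision is unknown or unneeded
    kind: str = "SI"             # "SI" or "UTC" subtype

    def __post_init__(self):
        if self.kind not in ("SI", "UTC"):
            raise ValueError("kind must be 'SI' or 'UTC'")
        if not -(2 ** 63) <= self.seconds < 2 ** 63:
            raise ValueError("seconds does not fit in a signed 64-bit integer")
        if self.nanos is None:
            return
        # Assumption: the UTC subtype may run up to (but not including) 2e9 ns.
        limit = 2 * NS_PER_SEC if self.kind == "UTC" else NS_PER_SEC
        if not 0 <= self.nanos < limit:
            raise ValueError("nanos out of range for this subtype")

# Half a second into the leap second at the end of 2016, as a UTC timestamp.
in_leap_second = Timestamp(1483228799, 1_500_000_000, kind="UTC")
print(in_leap_second)
```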

That's an interesting approach, but not without pitfalls. You can't just subtract two timestamps to get a time difference, since there may have been one with 2 billion nanoseconds somewhere in between.

This is true. However:

- You can subtract two SI-based timestamps to get a SI-based time difference. This will give you the correct number of nanoseconds.

- You can subtract two UTC-based timestamps to get a UTC-based time difference. This will not give you a number of nanoseconds.

Note that, if you have a list of when leap seconds are, then you can convert between SI-based and UTC-based timestamps which are counted from the epoch; in this way it is possible to get a SI-based time difference from UTC-based timestamps if you need it.

It is not as simple as just subtracting them, of course, but if your application needs precise time differences then you can just use SI-based timestamps instead of UTC-based ones, so that you can simply subtract them from each other. Programs that need to deal with days and calendars are likely to prefer UTC-based timestamps.
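
The conversion described above can be sketched as follows, assuming UTC-based timestamps are (seconds, nanoseconds) pairs in which a leap second is written as the previous second plus more than one billion nanoseconds. The leap second table is abbreviated to the last three entries, and `utc_to_si_ns` and `si_elapsed_ns` are hypothetical names.

```python
import bisect
import calendar

NS_PER_SEC = 1_000_000_000

# POSIX value of each midnight at which a new TAI-UTC offset takes effect.
# Abbreviated table; a real one would come from an official leap second list.
_STARTS = [calendar.timegm((y, m, 1, 0, 0, 0))
           for y, m in ((2012, 7), (2015, 7), (2017, 1))]
_OFFSETS = [35, 36, 37]

def utc_to_si_ns(seconds, nanos=0):
    """Convert a UTC-based (seconds, nanos) pair to SI nanoseconds since the epoch."""
    i = bisect.bisect_right(_STARTS, seconds) - 1
    if i < 0:
        raise ValueError("timestamp is before the abbreviated table")
    return (seconds + _OFFSETS[i]) * NS_PER_SEC + nanos

def si_elapsed_ns(a, b):
    """Physical nanoseconds elapsed from UTC timestamp a to UTC timestamp b."""
    return utc_to_si_ns(*b) - utc_to_si_ns(*a)

before = (1483228799, 0)  # 2016-12-31 23:59:59 UTC
after = (1483228801, 0)   # 2017-01-01 00:00:01 UTC

naive = (after[0] - before[0]) * NS_PER_SEC  # plain subtraction misses the leap second
print(naive // NS_PER_SEC, si_elapsed_ns(before, after) // NS_PER_SEC)
```

Plain subtraction says two seconds passed across that midnight, while the table-based conversion says three.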