by pilif 4675 days ago
>Many web services choose to return dates in something other than a unix timestamp (unfortunately)

Wrong. That's very fortunate. Unix time stamps have some serious deficiencies as a data type for storing time information: for one, they lack precision. One second just might not do it. Then they lack any time zone information. You will never know what zone a specific time stamp is in. GMT? UTC? The time zone where the server is?

Sure. Maybe you are lucky and it's documented (it probably isn't, because people who care about such things are not using unix time stamps to begin with), but using a string time stamp formatted as ISO 8601 means that no documentation is needed. The encoding is good enough to store any sub-second time stamp, including time zone info.

That way, you can turn any of these into whatever your environment uses internally, which you will then use in conjunction with the library routines to deal with all the difficulties of doing math with dates (how many days in a month? What about leap years? What about time zones? None of these are really hard issues, but there are many to keep in mind and many possible causes of bugs).

7 comments

You are criticizing things which you do not understand (and getting upvoted for it on HN, which is a little disturbing).

As others have mentioned, Unix Timestamps can be arbitrarily precise by adding arbitrarily many places of decimal precision (and this is common practice, supported by the Unix "date" command, among other things).
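As a rough illustration of the point about decimal places, here is a minimal Python sketch. The timestamp value and the quarter-second fraction are made up for the example; only the standard library is used.

```python
from datetime import datetime, timezone

# A Unix timestamp carrying a fractional part (here: 250 milliseconds).
ts = 1378585039.25

# Convert to an aware UTC datetime; the fraction survives as microseconds.
dt = datetime.fromtimestamp(ts, tz=timezone.utc)

print(dt.microsecond)   # 250000
print(dt.isoformat())   # 2013-09-07T20:17:19.250000+00:00
```

Floats run out of precision eventually, so when nanoseconds matter an integer count (for example what `time.time_ns()` returns) is probably the safer representation.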

Secondly, Unix Time is an absolute timescale that is not relative to any time zone. A Unix Timestamp alone unambiguously (1) identifies an absolute point in time; there is no need to involve time zones, which are a political concept. A Unix Timestamp can be converted to any timezone and vice-versa. Any representation that is based on civil time is going to be more complicated and have more edge cases.

Thirdly, time zone offsets like -03:00 do not actually specify a time zone; they specify a time zone offset. These two are not the same thing. There are multiple time zones that can have a -03:00 offset, depending on the time of year. Even given a specific time of year, the time zone offset may not uniquely identify the time zone. For example, Arizona doesn't observe daylight saving time, so if you see a -07:00 time in the summer it could either be a PDT time (used on the west coast) or an MST time (used in Arizona).
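The Arizona case can be checked with a short Python sketch. It assumes Python 3.9+ with `zoneinfo` and an available tz database; the helper name `offset_at` and the two sample dates are just for illustration.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def offset_at(zone_name, local_time):
    """Return the UTC offset a zone uses at the given local wall-clock time."""
    return local_time.replace(tzinfo=ZoneInfo(zone_name)).utcoffset()

summer = datetime(2013, 7, 1, 12, 0)
winter = datetime(2013, 1, 1, 12, 0)

# In summer both zones report -07:00, so the offset alone cannot tell them apart.
print(offset_at("America/Phoenix", summer) == offset_at("America/Los_Angeles", summer))  # True

# In winter they differ: Phoenix stays at -07:00, Los Angeles moves to -08:00.
print(offset_at("America/Phoenix", winter) == timedelta(hours=-7))      # True
print(offset_at("America/Los_Angeles", winter) == timedelta(hours=-8))  # True
```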

Unix Timestamps have many advantages over text-based timestamp representations. They are much simpler to parse and have far fewer lexical variations. They are never invalid (whereas text-based dates like 2000-01-32 can be). They can be stored directly in a numeric variable. You can perform math on them directly.
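A small Python sketch of the validity and arithmetic points, using only the standard library. The helper `is_valid_iso_date` is a hypothetical name for this example, and the timestamp value is arbitrary.

```python
from datetime import date

def is_valid_iso_date(text):
    """Return True if text parses as a real calendar date in YYYY-MM-DD form."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False

# A text date can be lexically well-formed and still not exist.
print(is_valid_iso_date("2000-01-31"))  # True
print(is_valid_iso_date("2000-01-32"))  # False

# A numeric timestamp needs no parsing, and arithmetic is plain integer math.
ts = 1378585039
one_hour_later = ts + 3600
print(one_hour_later - ts)  # 3600
```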

(1) Except for leap seconds

I can't upvote this high enough. Anybody who thinks iso8601 dates are a good way to store time should not be allowed to handle time.

The same hour also does not occur twice in unix timestamps, though it does in most timezones (but not time offsets). Conversion rules are a mess, and have changed over time.
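To make the repeated hour concrete, here is a sketch using Python's `zoneinfo` (Python 3.9+, tz database assumed available). The zone and date are chosen for illustration: US daylight saving time ended on 2013-11-03, so 01:30 local time happened twice in New York that night.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")

# fold=0 picks the first occurrence (still EDT), fold=1 the second (EST).
first = datetime(2013, 11, 3, 1, 30, tzinfo=ny, fold=0)
second = datetime(2013, 11, 3, 1, 30, tzinfo=ny, fold=1)

# Same wall-clock reading, different offsets.
print(first.isoformat())   # 2013-11-03T01:30:00-04:00
print(second.isoformat())  # 2013-11-03T01:30:00-05:00

# The Unix timestamps are distinct and one hour apart.
print(second.timestamp() - first.timestamp())  # 3600.0
```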

The other major advantage of using ISO 8601 is that it's human-readable. Very few people are going to be able to look at a Unix timestamp and convert it in their head (...if you can, that's a good party trick).
I deal with epoch timestamps on a daily basis, and this is my go-to command:

  $ date -d @1378585039
  Sat Sep  7 20:17:19 UTC 2013
You must go to weird parties.
Another good reason to avoid POSIX timestamps: they ignore leap-seconds. Thus, to determine how many (for example) days are between two timestamps, you need an up-to-date database of leap-second insertions. Maybe an error-bar of a few seconds doesn't sound like much, but if that error bar happens to straddle midnight and (like most code) you get a date-stamp by truncation, you could be off by a day. If that error-bar happens to straddle midnight on December 31st and you're truncating to month or year values, you could be out by a whole lot more.
It doesn't matter either way. If you wish to account for leap seconds then the device that creates the timestamp needs access to that very same database.

The advantage of ignoring leap-seconds on the recorder is you can map any sufficiently precise monotonic clock to UNIX time with a simple linear equation. Personally I think it makes a lot more sense to keep the complexity contained to the decoder, rather than the encoder where bugs could mean you end up not recording an accurate timestamp to begin with.
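The "simple linear equation" could look something like the following Python sketch. The function names and the `rate` parameter (a drift correction factor) are assumptions for illustration, not anything the comment specifies.

```python
import time

def calibrate():
    """Take one paired reading of the monotonic clock and the wall clock."""
    return time.monotonic(), time.time()

def mono_to_unix(mono, mono_ref, unix_ref, rate=1.0):
    """Map a monotonic reading to Unix time: unix_ref + rate * (mono - mono_ref)."""
    return unix_ref + rate * (mono - mono_ref)

# The recorder only needs one reference pair; no leap-second table is involved.
mono_ref, unix_ref = calibrate()
print(mono_to_unix(time.monotonic(), mono_ref, unix_ref))
```

Any leap-second handling would then live in whatever later decodes the recorded values.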

Why would you ever care about leap seconds when calculating the number of days between timestamps? Leap seconds are necessary to calculate the exact number of seconds between two timestamps. But a day isn't exactly 86,400 seconds on leap second days, it's a little bit longer. So the simple algorithm for calculating the number of days between timestamps (floor((ts2-ts1)/86400)) seems more correct than anything that takes leap seconds into account.
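A sketch of that floor-division approach in Python. The date range is picked for illustration because a leap second was inserted at the end of 2012-06-30; the function name is hypothetical.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400

def whole_days_between(ts1, ts2):
    """Number of whole days between two Unix timestamps, ignoring leap seconds."""
    return (ts2 - ts1) // SECONDS_PER_DAY

# Midnight UTC on either side of the leap second at the end of 2012-06-30.
start = int(datetime(2012, 6, 30, tzinfo=timezone.utc).timestamp())
end = int(datetime(2012, 7, 2, tzinfo=timezone.utc).timestamp())

# Unix time counts 172800 seconds here, although 172801 SI seconds elapsed.
print(end - start)                     # 172800
print(whole_days_between(start, end))  # 2
```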
A leap second only lengthens one day.

In any event POSIX time stamps are fine w.r.t. leap seconds, it's the conversion functions which may or may not reflect them.

Actually the ISO format does not give you timezone information. It gives you offset from UTC, from which you cannot infer anything about the timezone in which the date resides.
A string timestamp needs more documentation, because there are many subtly different string formats. Does it refer to a specific instant, or a political time notion? What separator is in use? Are leap seconds a possibility?

The ISO date format's notion of timezones is a compromise that gives you the worst of all worlds. They complicate referring to a physical instant, because you can refer to it in several timezones, rather than the unique representation of a unix timestamp. But they're inadequate for political time, because what time comes 6 months after 13:00 (+00:00)? (It could be 13:00 (+01:00) or 13:00 (+00:00) or likely others - you need a symbolic timezone like "Europe/Lisbon").
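The "6 months after 13:00 (+00:00)" question can be illustrated with a Python sketch. It assumes Python 3.9+ `zoneinfo` with a tz database; the dates are arbitrary examples.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# With a symbolic zone, "same wall-clock time six months later" picks up DST.
lisbon = ZoneInfo("Europe/Lisbon")
winter = datetime(2013, 1, 15, 13, 0, tzinfo=lisbon)
summer = winter.replace(month=7)

print(winter.isoformat())  # 2013-01-15T13:00:00+00:00
print(summer.isoformat())  # 2013-07-15T13:00:00+01:00

# With only a fixed offset, the information needed to do that is gone.
fixed = datetime(2013, 1, 15, 13, 0, tzinfo=timezone.utc)
print(fixed.replace(month=7).isoformat())  # 2013-07-15T13:00:00+00:00
```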

For physical or "system" times unix time is great (unless you need the greater precision, but how common is that?). For user-facing times ISO is inadequate. The use case for ISO string datetime formats is very narrow.

"They complicate referring to a physical instant, because you can refer to it in several timezones, rather than the unique representation of a unix timestamp."

I thought it's supposed to be an external format. I'd always expect a computer system presenting the output information to the user in his local time zone, while accepting inputs from all time zones equally.

"But they're inadequate for political time, because what time comes 6 months after 13:00 (+00:00)? (It could be 13:00 (+01:00) or 13:00 (+00:00) or likely others - you need a symbolic timezone like "Europe/Lisbon")."

You can't standardize a changing practice. I'd never expect it to deal with these issues.

>I thought it's supposed to be an external format. I'd always expect a computer system presenting the output information to the user in his local time zone, while accepting inputs from all time zones equally.

We're talking about a web API here, not user display. But even so, IME users don't think of their timezones as "+8" or the like, so for human I/O you want to use symbolic timezone names, not offsets.

Precision can be fixed by just adding a decimal point. And a "UNIX time stamp" doesn't need a time zone because it's always UTC.

However, your overall point there remains valid, because people will try to pass off something as a "UNIX time stamp" that is actually in a different time zone. There is value to self-describing data.

It's frequently important to preserve the original time zone offset of a timestamp. Sending everything as UTC loses that information.
There is nothing preventing you from sending a time zone offset (or an Olson timezone id like 'America/Los_Angeles') along with the Unix Time.
Sure no problem. We just have to invent a new format.
That's what JSON is for:

{"timestamp": xyz, "tz": "America/Los_Angeles"}
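A round trip for such a payload might look like this in Python. The function names and the sample timestamp are assumptions for the sketch; the field names follow the JSON above.

```python
import json
from datetime import datetime
from zoneinfo import ZoneInfo

def encode_event(ts, tz_name):
    """Serialize a Unix timestamp together with an Olson zone id."""
    return json.dumps({"timestamp": ts, "tz": tz_name})

def decode_event(payload):
    """Rebuild the original local time from the timestamp and zone id."""
    data = json.loads(payload)
    return datetime.fromtimestamp(data["timestamp"], tz=ZoneInfo(data["tz"]))

payload = encode_event(1378585039, "America/Los_Angeles")
local = decode_event(payload)
print(local.isoformat())  # 2013-09-07T13:17:19-07:00
```

The instant stays unambiguous in the number, and the zone id carries the local context that a bare offset would lose.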

Unix timestamps are always UTC (GMT)

Quote:

> Unix time, or POSIX time, is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970

More importantly (the GP missed this as well), Unix timestamps can't convey local time. Local time has UI implications, e.g. the query "is this event on a weekend" is not generally answerable without the time zone.
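The weekend question can be shown with a short Python sketch (Python 3.9+ `zoneinfo`, tz database assumed available). The timestamp is an example chosen to fall on Friday 23:00 UTC, and `is_weekend` is a hypothetical helper.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def is_weekend(ts, tz_name):
    """True if the instant falls on Saturday or Sunday in the given zone."""
    local = datetime.fromtimestamp(ts, tz=ZoneInfo(tz_name))
    return local.weekday() >= 5  # Monday is 0, so 5 and 6 are the weekend

ts = 1378508400  # Friday 2013-09-06 23:00:00 UTC

print(is_weekend(ts, "Asia/Tokyo"))           # True  (Saturday 08:00 locally)
print(is_weekend(ts, "America/Los_Angeles"))  # False (Friday 16:00 locally)
```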
For historical dates, I'd rather everyone knew how to convert accurately to and from UTC (2 conversions), rather than relying on everyone to have a bug-free and up to date implementation of 2N(N-1) conversions.

That said, the exception for local time, at least in my opinion, is agreeing on dates in the future meant for human interaction (e.g. "I'll meet you at 7 AM local time in Times Square on the 3rd of April 2068"). Here time zone rules may actually change before the date transpires, and you can't be sure of the representation in any other zone or format until closer to the event.

Entirely true, but not necessarily relevant.

Local time is a weird thing and changes all the time.

For giggles, look at the history of timezone rule changes in tzdata.

Most timezones have at least one duplicate hour per year (i.e. the same time occurs twice), in the US as well.

Local times are not an appropriate way to store time.

Note: ISO8601 does not give you local time anyway, since you cannot infer the timezone from the time offset.

No, that's exactly my point. Sometimes you want an event to occur at 9 AM EST, regardless of how that translates to UTC.

That you cannot infer the timezone from the time offset in ISO 8601 is a good point though.

If you've ever moved country, and imported some data from the old country and mixed it with data from your new country, it quickly becomes obvious why preserving source timezone is a deeply useful attribute ("It's a beautifully sunny day! -- me, 04:17hrs").

Or at the very least, preserving the time offset.

Uhm, UTC != GMT. Nobody in this comments thread knows what they're talking about and I wouldn't trust anyone here to program anything to do with time.
As far as I can tell, GMT is now UT1 which may not diverge from UTC by more than one second. I don't think conflating the two is especially egregious in this context.
My experience suggests that's not always the case :(.
This is a standard as well as a convention. People breaking it are idiots. Most software out there assumes UTC time when dealing with Unix timestamps. Most Java date libraries handle this perfectly and get the correct timezone.