	by hmate9 2299 days ago
	This is why no one should ever ever write their own Time or Date library. The number of edge cases is simply enormous

5 comments

Misdicorl 2299 days ago

Almost all the edge cases with time have to do with localization.

Build your time library on (un)signed 64 bit integers representing the number of nanoseconds since the utc epoch. Adjust above sentence to reflect the level of precision and range your use case needs. You are now done for 80% of use cases (perf timing, logging, timeouts, event storage, event ordering within jitter).

If you need to parse/display for humans or have something happen at a particular time in a particular timezone, things get gross. But that's no different than any other situation where you eventually have to interface machine data with humans. Either it's your particular expertise or it's a distraction and you should use someone else's solution.
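A minimal sketch of the integer-timestamp idea above, in Python. The `Instant` class and its method names are made up for illustration; only `time.time_ns()` is a real standard-library call. A signed 64-bit nanosecond count covers roughly 292 years either side of the epoch, which is why the range caveat matters.

```python
import time
from dataclasses import dataclass

NS_PER_SECOND = 1_000_000_000
INT64_MAX = 2**63 - 1


@dataclass(frozen=True, order=True)
class Instant:
    """Hypothetical wrapper: nanoseconds since the Unix epoch as a plain int."""

    ns: int

    @classmethod
    def now(cls) -> "Instant":
        # time.time_ns() reads the wall clock, so it can jump if the clock is stepped.
        return cls(time.time_ns())

    def __sub__(self, other: "Instant") -> int:
        # Difference between two instants, in nanoseconds.
        return self.ns - other.ns


# Range of a signed 64-bit nanosecond counter, in Julian years (about 292).
RANGE_YEARS = INT64_MAX / NS_PER_SECOND / (365.25 * 86400)
```

Ordering and subtraction are plain integer operations here; nothing in this sketch knows about time zones or calendars.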

earthboundkid 2298 days ago

> Build your time library on (un)signed 64 bit integers representing the number of nanoseconds since the utc epoch.

You poor sweet summer child. Has no one told you about the leap seconds yet? Unix time is not the number of seconds since epoch. It deliberately excludes leap seconds, which happen unpredictably whenever scientists measure the Earth as having spun at a different enough speed for long enough.

Time is fucked on every level:

- Philosophical: What is time? We just don't know.

- Physical: Turns out there is no such thing as simultaneity, and time flows differently at different locations. Time may be discrete at the Planck level, but we don't really know yet.

- Cosmological: The Earth does not rotate at a constant speed, the Earth's orbit around the Sun is not a whole number of its rotations, and the Moon does not orbit at an even ratio either.

- Historical: Humans have not used time or calendars consistently.

- Notational: Some time notations are ambiguous (e.g. during daylight savings transitions) and others are skipped.

- Regional: Different regions use subtly different clocks and calendars.

- Political: Different political actors choose to change time whenever they feel like it with little or no warning.

- Religious: Many religions come with their own system for timekeeping, and people don't like when outsiders impose other systems.

perl4ever 2298 days ago

"You poor sweet summer child"

haha. Google is way ahead of you and your "leap seconds".

"Since 2008, instead of applying leap seconds to our servers using clock steps, we have "smeared" the extra second across the hours before and after each leap. The leap smear applies to all Google services, including all our APIs."

https://developers.google.com/time/smear

...now write code to convert Google time to any other random type of time.

c3534l 2298 days ago

Both

https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd...

and https://en.wikipedia.org/wiki/Unix_time

indicate that leap seconds are not included in Unix time as seconds-since-the-epoch. Leap seconds are included in UTC, and thus the Unix time appears to skip a second relative to UTC.

zokier 2298 days ago

> Unix time is not the number of seconds since epoch

I assumed that is (part of) why he ended up wanting to build his own. The way UNIX time handles leap seconds was arguably a mistake; GPS (and Galileo) time does it right by keeping the leap second information in a separate field. So as a time scale I imagine OP's would be similar to GPS but with a different epoch. Of course, everyone loves having informally defined ad-hoc timescales around.

mytailorisrich 2298 days ago

> Unix time is not the number of seconds since epoch

In fairness, he said to count the number of seconds since the Epoch. That is independent of UTC.

That means using e.g. TAI. Unix time also ignores leap seconds since it counts the number of seconds actually elapsed.

labawi 2298 days ago

> Unix time also ignores leap seconds

hence it counts the number of seconds that would have elapsed, if they didn't exist. In effect, its timescale has time that never happened, and time that happened twice.

burfog 2298 days ago

You can still have problems.

It may be determined that the computer's clock got too far ahead. For example, it booted and you cared about time, but NTP hadn't yet made corrections. Suddenly the time runs backwards.

Leap seconds may get interesting too, especially if you have to predict ahead or if the OS isn't updated often enough. (there is a 6-month warning) If you want to call leap seconds an issue for humans, then you aren't using UTC at all. You're using TAI. Software interfaces often ignore the distinction between UTC and TAI, and even between UTC and UT1, preferring to pretend these issues don't exist. POSIX is in conflict with international timekeeping, effectively requiring that there are zero leap seconds.

codetrotter 2298 days ago

> Suddenly the time runs backwards

It is my understanding that the way NTP daemons work is that time is never adjusted backward. Instead, the ticks are "slowed down" on the local machine until it is in sync with the NTP time. However, if the difference is too great then I think NTP daemons might refuse to correct the time altogether. So then, if my understanding is correct, your machine is "stuck in the future". But it will never make a jump backwards because of NTP.

However, I am not familiar with the intricate details of NTP so do take this with a grain of salt.

twic 2298 days ago

My understanding (I am also not an expert!) is that common NTP implementations will do the slowing down ("slewing") to correct small errors, but will just change the time ("stepping") to correct large errors. They will even do this if that means time going backwards.

Subject to configuration, of course. man ntpd [1] says:

> Sometimes, in particular when ntpd is first started, the error might exceed 128 ms. This may on occasion cause the clock to be set backwards if the local clock time is more than 128 s in the future relative to the server. In some applications, this behavior may be unacceptable. If the -x option is included on the command line, the clock will never be stepped and only slew corrections will be used.

[1] https://linux.die.net/man/8/ntpd

LukeShu 2298 days ago

timesyncd also does this for "large offsets", but what "large" is is neither configurable nor documented, but in the source:

    /*
     * Maximum delta in seconds which the system clock is gradually adjusted
     * (slewed) to approach the network time. Deltas larger that this are set by
     * letting the system time jump. The kernel's limit for adjtime is 0.5s.
     */
    #define NTP_MAX_ADJUST                  0.4
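
The slew-versus-step decision described in this subthread can be sketched as a simple threshold test. `plan_correction` is a hypothetical helper, not part of ntpd or timesyncd; the 0.4 s default only mirrors the `NTP_MAX_ADJUST` value quoted above.

```python
MAX_SLEW_S = 0.4  # assumption: same threshold as the NTP_MAX_ADJUST define quoted above


def plan_correction(offset_s: float, max_slew_s: float = MAX_SLEW_S) -> str:
    """Return 'slew' for small clock errors and 'step' for large ones.

    offset_s is (network time - local time) in seconds; its sign does not
    affect the choice, only its magnitude does.
    """
    return "slew" if abs(offset_s) <= max_slew_s else "step"
```

A "step" with a negative offset is the case where the local clock visibly runs backwards.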

fao_ 2298 days ago

> neither configurable nor documented, but in the source:

Build it yourself and pass -DNTP_MAX_ADJUST=XX

Misdicorl 2298 days ago

Leap seconds are only a problem for wall clocks (localization) and deciding whether to call something utc or tai.

The ntp case is contrived. Either you care and wait until ntp has connected to do your stuff. Or you care and don't let ntp rewind and instead smear. Or you don't care and deal with the consequences.

mortehu 2298 days ago

Leap seconds affect Unix time too, because leap seconds are excluded from "number of seconds since Unix epoch". You can't measure the length of time intervals spanning leap seconds with simple subtraction.
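
To make that concrete: a leap second (23:59:60 UTC) was inserted at the end of 31 December 2016, yet the Unix timestamps on either side of it differ by only one. A small sketch using the standard-library `calendar.timegm`; the leap-second count is supplied by hand and stands in for a real leap-second table.

```python
import calendar

# Unix timestamps for the UTC seconds on either side of the 2016 leap second.
before = calendar.timegm((2016, 12, 31, 23, 59, 59))
after = calendar.timegm((2017, 1, 1, 0, 0, 0))

# Simple subtraction on Unix time sees one second between them.
unix_delta = after - before

# Assumption: a hand-maintained count of leap seconds inside the interval.
leap_seconds_in_interval = 1

# SI seconds that actually elapsed: 23:59:59 -> 23:59:60 -> 00:00:00.
elapsed_si_seconds = unix_delta + leap_seconds_in_interval
```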

lmm 2298 days ago

> Leap seconds affect Unix time too, because leap seconds are excluded from "number of seconds since Unix epoch".

They are excluded from "unix timestamps". They would still be part of the number of seconds since Unix epoch.

labawi 2298 days ago

What does that mean and how would that work?

AFAIK, unix time skips a beat or repeats itself to remain in alignment with UTC, and just keeps chugging along.

Misdicorl 2298 days ago

Are they truly? There must be a good reason but I can't fathom what it could be

GolDDranks 2298 days ago

That good reason is that there is no general "formula" for leap seconds, unlike for leap years, they have to be looked up. So you can't do "offline" date calculations if they included leap seconds.

I think that UNIX time stamps are generally a very good approximation, and if you are comparing long enough time intervals for the error to get over one second, and/or that error to matter, you are doing something wrong anyway.

For exact time interval measurements that you have to get exactly right, don't use UNIX time stamps.

heartbeats 2298 days ago

Because otherwise you would get annoying off-by-27 errors whenever you did time() % 60.

https://en.wikipedia.org/wiki/Leap_second#Binary_representat...
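
The convenience being described is that every Unix day is exactly 86400 seconds long, so clock fields fall out of plain arithmetic. A sketch (the function name is illustrative):

```python
SECONDS_PER_DAY = 86400


def utc_time_of_day(unix_ts: int) -> tuple[int, int, int]:
    """Split a Unix timestamp into (hour, minute, second) of the UTC day."""
    secs = unix_ts % SECONDS_PER_DAY
    return secs // 3600, (secs % 3600) // 60, secs % 60
```

If leap seconds were counted, the `% 60` result would be offset from the UTC seconds field by the number of leap seconds inserted so far.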

nitwit005 2298 days ago

It depends on the purpose of the app, but you often have to store user entered times with time zone information. The rules for things like daylight savings time can and do change, and you don't want future scheduled events drifting by an hour.

gmueckl 2298 days ago

Let's not forget events spanning time zones. Just some random thing that came to my mind: how would you handle calendar entries where half the participants made a DST transition since the entry was created and the other half didn't? This happens for example when half of the team is in the US and the other half in the EU. The transition dates are one to three weeks apart, depending on the season.

perl4ever 2298 days ago

I remember seeing a thick dead tree type of book with the history of time zones in the US, for figuring out times in historical documents when things were less standardized. It was practically the size of a phone book; I think it probably covered county level history or something like that.

macintux 2298 days ago

I have a book covering, among other things, Indiana time zones for a few years during IIRC the 1960s.

It’s frankly amazing how much they changed every year. Different counties, and sometimes towns/cities within counties, would jump back and forth year to year. It would have been awful to manage if computers had been more important at that time.

Misdicorl 2298 days ago

None of these edge cases are solved by using off the shelf libraries instead of running your own.

gmueckl 2298 days ago

None of these edge cases have a chance of being handled correctly without an accurate time zone database to detect the problem at all. Good luck maintaining that on the side! I am fairly certain that you would get it wrong. There's a reason why most software relies on zoneinfo.

Misdicorl 2298 days ago

Yes, as I said above, if you need to localize then you'll need more detail.

e12e 2298 days ago

When do you not?

lmm 2298 days ago

When you need some system activity to happen regularly (e.g. every 10 minutes) but not at a specifically human-meaningful time. When you need to know what order events occurred in or how far apart two events are, but don't need to correlate those times with external events (or can easily establish "system" times for any relevant external events). You can cover a lot of cases without having to touch "human time" at all.
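
For the "how far apart" and "every N minutes" cases, a monotonic clock avoids wall-clock jumps entirely. A sketch using the standard-library `time.monotonic()`; `wait_until` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout_s: float, poll_s: float = 0.01) -> bool:
    """Poll predicate until it returns True or timeout_s has elapsed.

    The deadline is computed on the monotonic clock, so NTP steps or manual
    changes to the wall clock cannot shorten or lengthen the wait.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_s)
    # One last check so a condition that became true right at the deadline counts.
    return predicate()
```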

mytailorisrich 2298 days ago

I've come to the same conclusion as you: Keep time as a purely monotonic integer for everything and convert that as needed to display to humans.

This also pushes all the madness to the edges and out of the business logic.

That being said, this works well for applications that are not "date intensive" so to speak. If your business logic has to deal specifically with calendar dates, e.g., monthly events, then you have to deal with calendar months and all that this involves, including explicitly dealing with the 29th February.

andrewfong 2298 days ago

You want to store both local time (and timezone and/or some proxy for location if possible) and Unix / UTC / integer time. The latter is what your application relies on 99% of the time, but if, say, a given country gets rid of daylight saving time (as Brazil did last year), having the local time is helpful for recomputing your Unix time.
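
A rough sketch of that "store both" record, using only fixed UTC offsets so it does not depend on a time zone database. `ScheduledEvent` and `resolve` are hypothetical names, and the -2/-3 hour offsets are illustrative values for a São Paulo event before and after the rule change; a real system would look the offset up from zoneinfo using the stored zone name.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ScheduledEvent:
    local: datetime  # naive wall-clock time exactly as the user entered it
    zone: str        # zone name, kept so the offset can be re-resolved later
    utc_ts: int      # cached Unix timestamp derived from local + current rules


def resolve(local: datetime, utc_offset_hours: int) -> int:
    """Convert a naive local time to a Unix timestamp under a given UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return int(local.replace(tzinfo=tz).timestamp())


local = datetime(2019, 12, 1, 10, 0)

# Created while the rules still predicted daylight saving time (UTC-2).
event = ScheduledEvent(local, "America/Sao_Paulo", resolve(local, -2))
old_ts = event.utc_ts

# Rules changed (no DST, UTC-3): recompute from the stored local time.
event.utc_ts = resolve(event.local, -3)
```

Without the stored local time, the event would silently stay at the old instant and show up an hour early on the user's clock.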

nradov 2298 days ago

The monotonic integer approach doesn't work for most healthcare applications. Due to safety and compliance requirements we typically need to record both the local time and the zone offset which applied at that instant.

Misdicorl 2298 days ago

This is surprising for event recording, since they should be equivalent, but time + zone is strictly more likely to be messed up. E.g. 1:30 a.m. occurs twice during the autumn daylight saving changeover, so you must correctly specify EST or EDT.
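
The ambiguity can be shown with fixed offsets: the same wall-clock reading in the repeated hour names two instants an hour apart, and only the recorded offset tells them apart. A sketch with the standard library; the date is the US changeover of 1 November 2020, and the `EDT`/`EST` constants are defined here by hand rather than read from a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Hand-defined fixed offsets (assumption: US Eastern, daylight vs standard).
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# The same wall-clock reading, recorded with each offset.
first_pass = datetime(2020, 11, 1, 1, 30, tzinfo=EDT)
second_pass = datetime(2020, 11, 1, 1, 30, tzinfo=EST)

# Aware datetimes compare and subtract as instants, not as wall-clock readings.
gap = second_pass - first_pass

# Recording local time plus offset keeps both pieces of information.
record = first_pass.isoformat()
```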

ajnin 2298 days ago

You're assuming a source of monotonic nanotime is easy to get and will always be available. This is not the case. As far as I know you have either "wall time" as usually defined, with all the problems associated with clock drift, NTP syncs, the computer going into sleep mode, etc., or "nanotime", which is a time delta from some arbitrary reference and which can only be used to compute time differences and not arbitrary instants in time. One cannot be converted into the other easily, or at all.

Misdicorl 2298 days ago

You use the arbitrary reference and sample the wall clock a couple times to interpolate a best effort offset from utc 0.

If you care more than this, you'll be displeased with off the shelf solutions too.
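
One way to read the "sample the wall clock a couple times" suggestion, as a best-effort sketch. `estimate_offset_ns` and `now_ns` are hypothetical names; the approach brackets each wall-clock read between two monotonic reads and keeps the sample with the narrowest bracket. It does not correct for later drift or for the machine sleeping.

```python
import time


def estimate_offset_ns(samples: int = 5) -> int:
    """Estimate (Unix epoch ns - monotonic ns) from a few bracketed samples."""
    best_width = None
    best_offset = 0
    for _ in range(samples):
        m0 = time.monotonic_ns()
        wall = time.time_ns()
        m1 = time.monotonic_ns()
        width = m1 - m0
        # Assume the wall-clock read happened midway between the two monotonic reads.
        offset = wall - (m0 + m1) // 2
        if best_width is None or width < best_width:
            best_width, best_offset = width, offset
    return best_offset


OFFSET_NS = estimate_offset_ns()


def now_ns() -> int:
    """Monotonic timestamp shifted onto the Unix epoch; never runs backwards."""
    return time.monotonic_ns() + OFFSET_NS
```

The result only stays close to wall time as long as the wall clock is not stepped after the offset was sampled.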

ken 2298 days ago

Your percentages are backwards, IME. Timeouts and logging and such are definitely not “80% of use cases”. Interfacing with humans and human systems are.

Misdicorl 2298 days ago

I disagree. Time is pervasive and will exist in every application in some way. It will only matter to the user in a small subset of them.

jmcqk6 2298 days ago

I'm just going to leave this here: https://infiniteundo.com/post/25326999628/falsehoods-program...

PopeDotNinja 2298 days ago

Immediately thought of Tom Scott's Computerphile video on time & timezones:

https://youtu.be/-5wpm-gesOY

10m12s watch. Informative and entertaining.

velox_io 2299 days ago

Ask Jon Skeet: https://blog.nodatime.org/2011/08/what-wrong-with-datetime-a...

hyperpape 2299 days ago

When you encounter someone saying "no one should ever", you can substitute "fewer than a half dozen groups of people should tackle this problem (in systems they want to use in production). It will take each such group a tremendous number of hours, involving a multi-year process of slowly finding edge cases and missing functionality."

If that's not true, then there's room to complain, but dates and times fit the bill. In some ways they're worse than other "harder" problems, because people are more likely to think those harder problems are too hard for them. And while the vast majority of companies don't need a proprietary database, it's more likely to be a competitive advantage than your own datetime library.

I think I could eventually write a good datetime library. But I certainly should not, unless I decide that's going to be one of my major efforts to help a language that doesn't already have one.

maxerickson 2299 days ago

What if people tried to speak precisely for effect instead of hyperbolically for emphasis?

jrandm 2299 days ago

I believe accurately conveying identical information in a manner everyone can understand is an impossible, or at least unreasonable, burden to place on individuals.

Instead, what if people tried to hear graciously and "assume good faith"?

The quote is ripped from HN's guidelines.

maxerickson 2298 days ago

Can you please explain more about how your response relates to my comment?

For instance, I've not placed the burden of "conveying identical information in a manner everyone can understand" on anyone, nor have I assumed bad faith. So it seems like a weird comment to tack onto mine, but I am assuming I just don't correctly understand.

IggleSniggle 2298 days ago

Hyperbole and precision can get tricky in human language because different cultural groups have different language encodings for the same sets of words.

For example, if you live in London then "9 in the morning" means when the world synchronized clocks agree that, locally for you, the time is 9am. But if you say "9 in the morning" to someone in Belize, it means "first thing after you are finished with your morning and ready to start your day," which can mean 1pm in some cases.

Here's a lovely article on these kinds of time-keeping differences, around something that you might expect to have a precise meaning:

https://www.businessinsider.com/how-different-cultures-under...

More to the point at hand, however, you suggested that people not speak in hyperbole but instead speak accurately. Although you can request that others adjust their use of language while in your presence to better meet your needs for a certain kind of precision, policing other people's language isn't possible. However, re-interpreting what people say into what they mean is somewhat possible for an astute listener who understands the context.

jrandm 2298 days ago

The sibling replies by IggleSniggle I fully endorse as supporting my meaning in the original post.

As to explaining more about my thought process: You asked a hypothetical "what if" question which, given the question itself is imprecise, I interpreted as you wishing information was always conveyed to your desired precision/accuracy, and extrapolated that (given this is a public forum) into general communication.

I shared my thoughts, specifically that it seems impossible for a person to always communicate perfectly to an unknown audience, and offered a different hypothetical. The last line explains the punctuation use in my "what if."

wwweston 2298 days ago

Precision of understanding would go up and emotional investment in a topic would likely be more proportional to true stakes?

Sounds boring but functional.

ehsankia 2298 days ago

It's not only about writing your own library though. For example, it's often not reading the documentation properly.

In Python, if you take a datetime for Feb 29 and call .replace(year=X) with a non-leap year X, it'll raise a ValueError.

giovannibajo1 2298 days ago

I think .replace() is a mistake, and it shouldn't exist in the first place. The way dates work, replacing a single component is almost always going to create problems in specific cases.

wruza 2298 days ago

And then you have to implement business directives with “the same date in 2025” in documents already signed by all parties (and no one got confused, except math guys). Replace is not a mistake, it just should state what it does, so that a programmer could test and use it.

ehsankia 2298 days ago

It does state what it does, but people rarely read documentation, and unlike static languages, Python doesn't force you to deal with the exception. And since it's a very rare problem, most people don't catch the bug.

perl4ever 2298 days ago

What if you take March 31st and add one month to it? (I assume you can do that somehow)

deathanatos 2298 days ago

Not with the standard library. (The "timedelta" class only expresses units up to days, as it is ambiguous how long a month is.)

There's a nice package called "python-dateutil" that includes a "relativedelta" class; adding a month to March 31 results in April 30:

  In [6]: import datetime, dateutil.relativedelta
  In [7]: datetime.datetime(2020, 3, 31) + dateutil.relativedelta.relativedelta(months=1)
  Out[7]: datetime.datetime(2020, 4, 30, 0, 0)

Adding a year to a leap day:

  In [8]: datetime.datetime(2020, 2, 29) + dateutil.relativedelta.relativedelta(years=1)                                 
  Out[8]: datetime.datetime(2021, 2, 28, 0, 0)

The exact duration that relativedelta adds depends on what you add it to. (Hence the name.) But the results tend to match up with human expectations.

wruza 2298 days ago

You get April 30. Idk about all libraries, but date-fns and few others I worked with do it right.

Also, if you add one month to Apr 30 and two months to Mar 31, you’ll see that month addition is not commutative-y and one should operate on distances from a base date, not in an incremental way.
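
A standard-library sketch of clamped month addition that shows this effect. `add_months` is an illustrative helper, not the implementation used by date-fns or python-dateutil; it clamps the day to the last day of the target month.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    # Work on a zero-based month index so year rollover is a single divmod.
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))


base = date(2020, 3, 31)
stepwise = add_months(add_months(base, 1), 1)  # Mar 31 -> Apr 30 -> May 30
from_base = add_months(base, 2)                # Mar 31 -> May 31
```

The clamped intermediate result loses the original day of month, which is why offsets are best computed from the base date each time.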

Aeolun 2298 days ago

Even if you replace it with another leap year?

ehsankia 2298 days ago

No, another leap year would be fine. But yeah, a very common (wrong) pattern, if you want to find the same day a year in advance, is to do `d.replace(year=d.year + 1)`, and that would break on Feb 29 only, so one day every 4 years. It's a very common pattern unfortunately.
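
A sketch of the failure and one possible guard. `same_day_next_year` is a hypothetical helper, and falling back to Feb 28 is a policy choice (Mar 1 is equally defensible), not something the standard library decides for you.

```python
from datetime import date


def same_day_next_year(d: date) -> date:
    """Return the same calendar day one year later, mapping Feb 29 to Feb 28."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # Only reachable for Feb 29 when the next year is not a leap year.
        return d.replace(year=d.year + 1, day=28)


leap_day = date(2020, 2, 29)

# The naive pattern raises ValueError on a leap day.
try:
    leap_day.replace(year=leap_day.year + 1)
    naive_failed = False
except ValueError:
    naive_failed = True
```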

cgriswald 2298 days ago

It returns a new datetime object which includes its own validity check and raises an exception for a bad leap day the same way it would for March 32nd or Blurnsuary 12th. (However, year must be an integer in [1, 9999].)

Edit: So, another leap year would be fine.

76543210 2299 days ago

How about a C RTOS? How should that be handled?