Hacker News
by bryanrasmussen 2586 days ago
Ok, I don't understand though why the bug hasn't been fixed and is there any other widely used time localization library that makes the same mistake - not just in python but other languages?
4 comments

I've updated the post to clarify that one should not use `pytz` at all and should instead use `dateutil`. The latter has much more reasonable behavior and is more actively maintained. I also included a link to commentary about pytz by dateutil's current maintainer.

Why hasn't the bug been fixed? I'm not sure! But looking at the code it seems like this problem has existed for the past 9 years. My guess is that the developers and maintainers never saw it as an actual problem, despite its ubiquity.

I believe the problem comes from a conceptual mismatch of what a timezone object should be. The datetime people envisioned a dumb object that just contains some constants, not the dynamic objects produced by pytz. Using .localize as emphasized in the pytz documentation solves the problem completely.

It would be less of a problem if pytz defaulted to the last offset in the database rather than the first.

I think your concept of "dynamic" and "static" is exactly backwards from mine. datetime specifies an API (tzinfo) for dynamic objects, so that a single object provides the time zone information for many datetimes. pytz's localize function attaches a different static object to each datetime, depending on which one is appropriate.
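To make the two models concrete, here is a minimal stdlib-only sketch. It is not pytz's code: `DynamicEastern`, `localize`, and the April-through-October rule are made-up stand-ins (the real US DST rules are more involved), and `EST`/`EDT` are plain `datetime.timezone` instances.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class DynamicEastern(tzinfo):
    """One shared object; the offset is computed from each datetime.

    Hypothetical rule for illustration only: UTC-4 from April through
    October, UTC-5 otherwise.
    """

    def dst(self, dt):
        in_summer = dt is not None and 4 <= dt.month <= 10
        return timedelta(hours=1) if in_summer else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


# Static model: one fixed-offset object per period, chosen per datetime.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def localize(naive):
    """Hypothetical stand-in for a localize step: pick the fixed offset."""
    return naive.replace(tzinfo=EDT if 4 <= naive.month <= 10 else EST)


# Dynamic: both datetimes share the very same tzinfo object.
shared = DynamicEastern()
winter = datetime(2019, 1, 15, 12, 0, tzinfo=shared)
summer = datetime(2019, 7, 15, 12, 0, tzinfo=shared)
print(winter.tzinfo is summer.tzinfo)  # True
print(winter.isoformat(), summer.isoformat())

# Static: each datetime gets whichever fixed object applies to it.
static_winter = localize(datetime(2019, 1, 15, 12, 0))
static_summer = localize(datetime(2019, 7, 15, 12, 0))
print(static_winter.tzinfo is static_summer.tzinfo)  # False
print(static_winter.isoformat(), static_summer.isoformat())
```

Both approaches print the same offsets here; the difference is only in where the decision lives, in the shared object or in the step that attaches it.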

pytz defaulting to the last offset instead of the first would cause a lot more silent breakages, because it would probably be right about half the time, and wrong about half the time, and even when it's wrong it won't be obvious. Defaulting to the first value in the list is as close as pytz can come to failing loudly without actually raising an exception, because it's obviously wrong nearly 100% of the time.

Don't guess, fail explicitly.
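
A rough sketch of what failing explicitly could look like, using only the stdlib. `StrictZone`, `toy_new_york_offset`, and the month-based rule are hypothetical names and simplifications for illustration; pytz does not work this way.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class StrictZone(tzinfo):
    """Hypothetical zone object that refuses to guess an offset."""

    def __init__(self, name, offset_for):
        self.name = name
        self._offset_for = offset_for  # callable: naive datetime -> timedelta

    def utcoffset(self, dt):
        # Reached only when the zone itself was passed as tzinfo=.
        raise ValueError(f"{self.name}: use localize() instead of tzinfo=")

    def dst(self, dt):
        raise ValueError(f"{self.name}: use localize() instead of tzinfo=")

    def tzname(self, dt):
        return self.name

    def localize(self, naive):
        # Attach a fixed-offset object chosen for this particular datetime.
        return naive.replace(tzinfo=timezone(self._offset_for(naive)))


def toy_new_york_offset(naive):
    # Assumed, simplified rule; not the real US DST calendar.
    return timedelta(hours=-4 if 4 <= naive.month <= 10 else -5)


zone = StrictZone("Toy/New_York", toy_new_york_offset)

wrong = datetime(2019, 5, 21, 12, 30, tzinfo=zone)  # constructing is allowed
try:
    wrong.isoformat()  # needs the offset, so the zone raises here
except ValueError as exc:
    print("refused:", exc)

right = zone.localize(datetime(2019, 5, 21, 12, 30))
print(right.isoformat())  # 2019-05-21T12:30:00-04:00
```

The error only surfaces once something asks for the offset, since `datetime` does not consult the tzinfo at construction time.
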
> I believe the problem comes from a conceptual mismatch of what a timezone object should be.

There seem to be two commonly understood meanings of timezone which are confused in UIs and APIs: a geographical area which follows a particular winter and summertime regime over the year; and a particular offset from UTC like GMT or BST. I've seen a lot of bad design rooted in this confusion.

Just like people don't use the http module but requests, it's been years since the community moved away from manual manipulation of datetime/pytz for time zones.

Nowadays people use higher level libs such as pendulum:

    >>> import pendulum
    >>> print(pendulum.datetime(2019, 5, 21, 12, 30, tz='America/New_York'))
    2019-05-21T12:30:00-04:00
It avoids many gotchas, gives you more features and has a nicer API.

Like skilpat said, dateutil is a better fit than pytz, and hence pendulum uses it, as well as pytzdata, to stay up to date.

{{citation needed}}

Your assertion reminded me of this funny dialog about JS: https://hackernoon.com/how-it-feels-to-learn-javascript-in-2...

"pip install pendulum" is not complex.

Using pendulum is not complicated.

You can skim the doc in 5 minutes, your intern can do it too.

This is one of those tools that removes complexity when you use them.

Also, pendulum and the stdlib datetime module are compatible, making migration painless:

    >>> from datetime import datetime
    >>> import pendulum
    >>> pendulum.now() - datetime.now(pendulum.now().tz)
    <Period [2019-05-26T16:57:35.872732+02:00 -> 2019-05-26T16:57:35.872141+02:00]>
In the end, pendulum doesn't require you to install a transpiler, 100 plugins and create a configuration file like the post you link to. But it does save you from bugs, and you don't need to be an expert in time to use it.

I see only wins.

Just to be clear, I did not mean to imply that pendulum isn't better, but rather that not everyone uses it. Nor does everyone use requests, although it's very clearly better than what's in the standard library.

Discoverability is a very big issue in general, and with these libraries in particular. I for example have been working with Python for almost a decade and follow the community quite closely, but can't recall hearing of pendulum before. Last time I investigated this, the cool libraries to use were arrow and delorean. How are people expected to keep up to date with every cool new thing?

> "pip install pendulum" is not complex.

For much of the work I do, there's a _big_ jump in complexity from using python (2 or 3) and its stdlib vs. requiring a library. A script using only the stdlib is easy to distribute and get working on developer, ci, and production machines. Once an external library is required, I need machinery or scripts to manage or ensure presence of those dependencies.

It's probably no good to you, but one of the reasons I like NixOS is the simplicity of creating single file scripts including dependencies. For example, something using Pendulum and ffmpeg together would start like this:

    #!/usr/bin/env nix-shell
    #!nix-shell -i python -p python pythonPackages.pendulum ffmpeg
Then you just put the code after that.
I love Nix and NixOS personally... but it's been a tough sell at work, unfortunately.

Even with simple shell scripts... it's so easy to invoke programs with GNU extensions and later find they fail on a co-worker's macOS machine. And I often have a shell.nix sitting right there that defines the complete dependency closure; very frustrating to not be able to use it.

OK that's pretty cool
It was compared to the linked article.

Besides, for time zones, you can't do otherwise: it's not in the stdlib. So compared to pytz, it's easy.

Unrelated, but I highly recommend pex for scripts with dependencies. It will turn the script into a one-file bundle of all required modules, and all you need on the server is the same Python version.

It should be in the stdlib really.

Sure, IF you know that you should be using datetime ^D^D^D pytz ^D^D^D dateutils ^D^D^D pendulum.
That's true for absolutely everything in programming.
In other situations, flaws and enhancements to a library would be handled with new API/versions on that library itself. Not (an|a series of) entirely separate librar(y|ies). Big difference in discoverability.
I now have a deeper appreciation for front-end (front-line?) developers.
There should be a way to actively steer users away from these old libs that have fallen out of favor. There's a significant group of new developers coming to Python as their first language and first coding experience. This kind of crap (in the article) is exactly why programming used to be a total nightmare. The only tool I've found that helps with this (albeit in Java) is IntelliJ. What do people use for Python?
Datetime is a good module if you don't deal with time zones, which is most of the time. And it doesn't come with out-of-the-box support for time zones, so either you code it, or you use a third-party lib.

Hence we don't discourage people from using datetime, it's in the stdlib, and it's useful.
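
For what it's worth, a small sketch of what plain datetime covers on Python 3 without any extra package: naive arithmetic, plus fixed UTC offsets via `datetime.timezone`. The dates and the "EDT" label are just illustrative values picked by hand; nothing here knows when that offset actually applies.

```python
from datetime import datetime, timedelta, timezone

# Naive arithmetic: no time zone involved at all.
meeting = datetime(2019, 5, 21, 12, 30)
followup = meeting + timedelta(days=7, hours=2)
print(followup)  # 2019-05-28 14:30:00

# Fixed offsets: the datetime module has these, but the caller has to
# know which offset is in effect on the date in question.
utc_instant = datetime(2019, 5, 21, 16, 30, tzinfo=timezone.utc)
edt = timezone(timedelta(hours=-4), "EDT")
local = utc_instant.astimezone(edt)
print(local.isoformat())  # 2019-05-21T12:30:00-04:00
```

Deciding that -04:00 is right for that date is exactly the part the datetime module leaves to you or to a library.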

However, if you need time zones, either you code something yourself, in that case you are supposed to know what you are doing, or you look up the best libs for the job.

For the last part, no language has a perfect answer. It's an organic process. I've never met any tool solving it, not even IntelliJ.

I had no idea pendulum existed, is there some way I should have looked it up? None of the code in tz-aware packages I've ever read used it, for example.
Looking things up is not a solved problem, no.

I usually check the "awesome python list", ask on reddit and twitter, and then do some Google-fu. I select 3 packages and do some tests.

I got nothing better than that.

Another good source is to find talks at recent PyCon or PyData conferences that survey the topic, then pick the packages they recommend.
Good summary. I think one of the unspoken concerns is with the "look up the best libs for the job" step. How do we ensure that the up-to-date information is easily found in that lookup? Stack Overflow is a huge search sink for stuff like this, and has never taken the deprecation/evolution problem very seriously.

Maybe not in this exact case of dateutils vs. pendulum, but it's really easy to find outdated information on the web and struggle to confirm whether it's still the best answer.

> Nowadays people use higher level libs such as pendulum

I don't think Django pulls in "pendulum" (someone please correct me if I'm wrong).

If Django isn't using it, I have to question how relevant "pendulum" actually is.

Django tries to keep as few dependencies as possible.

Personally, when I use Django, I also use pendulum at the same time. Django's time zone management is limited to storing, retrieving and formatting time zone aware dates, but for:

- calculations

- moving from one time zone to another

- display of relative time

- parsing

- interval manipulations

Pendulum makes the job a breeze.

Let's put things back in context: most people don't need those features at all. And most people won't be affected by the bug in the article.

Use pendulum if you need it, which is quite rarely.

When I say people usually use libs like it, I mean people who have specific tasks that require it. Few people actually do.

It's like asyncio, or "is python fast enough" and other things like that.

Are arrow and pendulum the same?
Sébastien Eustace created pendulum specifically to address the deficiencies of arrow: http://blog.eustace.io/please-stop-using-arrow.html
This seems very much to underscore last week's comments by Amber Brown re: the problems of the batteries which are included.
> Ok, I don't understand though why the bug hasn't been fixed and is there any other widely used time localization library that makes the same mistake

Because the issue is in the way datetime interacts with timezone objects (it assumes timezone objects are dumb and there's no co-dependency, which is incorrect), and fixing that would require changing the protocol, which would break existing libraries.