|
It took me some effort to understand the issue here, so here is an alternative explanation in case it helps someone.

First, the part that's independent of programming language. You may want to read about absolute time and civil time (e.g. from https://abseil.io/docs/cpp/guides/time) but if you don't, in short: “civil time” refers to something like “2019 May 26 at 2:45 pm in New York City” (or “in the America/New_York time zone”), which means (roughly) whatever time the locals in New York City (or a larger shared geo-political zone) would agree is 2:45 pm on that date. To convert this to an absolute time, or in other words to make sense of “2019-05-26 14:45 in America/New_York”, we need data about the real world as of that date: most obviously we need to know whether Daylight-Saving Time was in effect on that date, but also what conventions were in use at the time. (This also means it's hard to know for certain what such a notation for a future date means in terms of absolute time, as DST could be abolished or the dates when it comes into effect could change.)

It so happens that in 1884 the conventions of New York City were such that it was about 4 minutes ahead of the then-recently standardized Eastern Time, so about 4 hours and 56 minutes behind GMT.

So, in any “correct” library, we should see the following respected:

• “2019 May 26 at 2:45 pm in New York” should mean “2019 May 26 at 18:45 UTC” (timezone is EDT, i.e. UTC minus 4 hours).

• “2019 Jan 26 at 2:45 pm in New York” should mean “2019 Jan 26 at 19:45 UTC” (timezone is EST, i.e. UTC minus 5 hours).

• “1884 Jan 26 at 2:45 pm in New York” should mean “1884 Jan 26 at 19:41 UTC” (timezone is... GMT minus 4 hours and 56 minutes).

----

Now the part that's Python-specific: the pytz library in Python provides two ways of constructing such a well-formed civil time.
One is to call `.localize` on a timezone object, passing it a naive `datetime`, and the other is to call `.astimezone` on an existing civil time to convert it to its equivalent (the same absolute time) in another timezone, thus obtaining a new civil time. Both are illustrated below, working properly:

>>> import datetime, pytz
>>> pytz.timezone('America/New_York').localize(datetime.datetime(2019, 5, 26, 14, 45, 0)).astimezone(pytz.utc)
datetime.datetime(2019, 5, 26, 18, 45, tzinfo=<UTC>)
>>> pytz.timezone('America/New_York').localize(datetime.datetime(2019, 1, 26, 14, 45, 0)).astimezone(pytz.utc)
datetime.datetime(2019, 1, 26, 19, 45, tzinfo=<UTC>)
>>> pytz.timezone('America/New_York').localize(datetime.datetime(1884, 1, 26, 14, 45, 0)).astimezone(pytz.utc)
datetime.datetime(1884, 1, 26, 19, 41, tzinfo=<UTC>)
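
For anyone who wants to check the arithmetic without pytz installed: the three expected results are just the wall-clock time minus whichever UTC offset applied on that date. Here is a minimal standard-library sketch. The three offsets are hard-coded from the bullets above rather than looked up in any timezone database, and the helper name `to_utc` is made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Offsets copied by hand from the three bullets; nothing is looked up here.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")
LMT = timezone(timedelta(hours=-4, minutes=-56), "LMT")


def to_utc(year, month, day, hour, minute, offset):
    """Attach a known fixed offset to a wall-clock time, then convert to UTC."""
    civil = datetime(year, month, day, hour, minute, tzinfo=offset)
    return civil.astimezone(timezone.utc)


may_2019 = to_utc(2019, 5, 26, 14, 45, EDT)
jan_2019 = to_utc(2019, 1, 26, 14, 45, EST)
jan_1884 = to_utc(1884, 1, 26, 14, 45, LMT)

print(may_2019.isoformat())  # 2019-05-26T18:45:00+00:00
print(jan_2019.isoformat())  # 2019-01-26T19:45:00+00:00
print(jan_1884.isoformat())  # 1884-01-26T19:41:00+00:00
```

Passing a fixed-offset `timezone` as `tzinfo` is fine here precisely because we already chose the right offset for each date by hand; the hard part a timezone library solves is choosing that offset for us.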
Unfortunately, there's a third thing a programmer can do, which the documentation warns against (http://pytz.sourceforge.net/#localized-times-and-date-arithm...), and that is to pass one of pytz's timezone objects as the “tzinfo” parameter to the standard-library `datetime.datetime` constructor:

>>> datetime.datetime(2019, 5, 26, 14, 45, 0, tzinfo=pytz.timezone('America/New_York')).astimezone(pytz.utc) # Don't do this!
datetime.datetime(2019, 5, 26, 19, 41, tzinfo=<UTC>)
>>> datetime.datetime(2019, 1, 26, 14, 45, 0, tzinfo=pytz.timezone('America/New_York')).astimezone(pytz.utc) # Don't do this!
datetime.datetime(2019, 1, 26, 19, 41, tzinfo=<UTC>)
>>> datetime.datetime(1884, 1, 26, 14, 45, 0, tzinfo=pytz.timezone('America/New_York')).astimezone(pytz.utc) # Don't do this!
datetime.datetime(1884, 1, 26, 19, 41, tzinfo=<UTC>)
which is certainly consistent in its own way, but only the last one is correct. Oops.

The issue here is in the interaction between the “tzinfo” model of the standard-library `datetime` and pytz's timezone objects: when the two are used together in the above incorrect way, one ends up with a timezone that is a fixed offset from UTC (here, the 4 hours 56 minutes seen above), which is silly. A timezone like `America/New_York` is not a fixed offset from UTC: not only does it change twice a year, it has also changed in arbitrary ways in the past, and may change in arbitrary ways in the future. (Note that “fixing” the offset of 4 hours 56 minutes to 5 hours would not solve any problems, as it would still be wrong for many months of each year; arguably, an obviously incorrect result may even be better than a sometimes-correct one.)

The linked blog post by Paul Ganssle (https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgu...), the maintainer of the `dateutil` library (not to be confused with the standard-library `datetime`), is also informative. |
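
To make the failure mode concrete, here is a toy model using only the standard library. It is not pytz's actual code: `FixedOffset`, `ToyZone`, and the crude year and month cutoffs are all invented for illustration. The only point is that a zone object which reports one default offset when attached directly as `tzinfo`, and which picks the correct offset only when asked via a `localize`-style method, reproduces the behaviour described above.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class FixedOffset(tzinfo):
    """A tzinfo that always reports the same UTC offset."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return None  # "unknown"; not needed for this illustration

    def tzname(self, dt):
        return self._name


class ToyZone(FixedOffset):
    """Hypothetical stand-in for a zone object (NOT how pytz is implemented).

    Attached directly as tzinfo, it behaves like its default offset (LMT).
    localize() returns a datetime carrying the offset for that wall time.
    """

    def __init__(self):
        super().__init__(-(4 * 60 + 56), "LMT")
        self._est = FixedOffset(-5 * 60, "EST")
        self._edt = FixedOffset(-4 * 60, "EDT")

    def localize(self, naive):
        # Crude, made-up rules: real DST boundaries are specific dates,
        # and the 1901 cutoff is arbitrary, chosen to match the example.
        if naive.year < 1901:
            chosen = self
        elif 4 <= naive.month <= 10:
            chosen = self._edt
        else:
            chosen = self._est
        return naive.replace(tzinfo=chosen)


zone = ToyZone()
naive = datetime(2019, 5, 26, 14, 45)

# The constructor-style path never asks the zone which offset applies.
wrong = naive.replace(tzinfo=zone).astimezone(timezone.utc)
# The localize-style path lets the zone choose the offset first.
right = zone.localize(naive).astimezone(timezone.utc)

print(wrong.isoformat())  # 2019-05-26T19:41:00+00:00
print(right.isoformat())  # 2019-05-26T18:45:00+00:00
```

In this toy, as in the real examples, the standard-library machinery simply trusts whatever offset the attached `tzinfo` object reports, so the fix has to happen when the civil time is constructed, not afterwards.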