| “Parquet has both a date type and the datetime type (both sensibly recorded as integers in UTC).” What does it mean for a date to be utc? my date, but in the utc timezone? Usually when I write a date to a file, I want a naive date; since that’s the domain of the data. 2020-12-12 sales: $500. But splitting that by some other timezone seems to be introducing a mistake. Often I want to think in local naive time too, especially for things like appointments that might change depending on dst or whatever. Converting to utc involves some scary things. Timestamps are also useful but I don’t want to transcode my data sometimes as the natural format is the most correct. |
An example of non-UTC time is TAI, which is International Atomic Time. The difference is that UTC has leap seconds to deal with changes in the rate of rotation of the earth, while TAI marches on without any discontinuities.
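To put a number on that difference, here is a rough Python sketch. It assumes the TAI − UTC offset of 37 seconds, which has applied since 2017-01-01; earlier instants would need the full leap-second table from the IERS. The name `utc_to_tai_reading` is made up for illustration and is not part of any library.

```python
from datetime import datetime, timedelta, timezone

# Assumption: TAI - UTC = 37 s, the offset in force since 2017-01-01.
# Earlier instants need the historical IERS leap-second table instead.
TAI_MINUS_UTC = timedelta(seconds=37)
OFFSET_VALID_FROM = datetime(2017, 1, 1, tzinfo=timezone.utc)


def utc_to_tai_reading(instant: datetime) -> datetime:
    """Return the TAI clock reading (as a naive datetime) for an aware instant."""
    if instant.tzinfo is None:
        raise ValueError("need a timezone-aware datetime")
    if instant < OFFSET_VALID_FROM:
        raise ValueError("fixed 37 s offset only holds from 2017-01-01 onward")
    # Normalize to UTC, drop the tzinfo, then shift onto the TAI scale.
    utc_wall = instant.astimezone(timezone.utc).replace(tzinfo=None)
    return utc_wall + TAI_MINUS_UTC


midnight_utc = datetime(2020, 12, 12, tzinfo=timezone.utc)
print(utc_to_tai_reading(midnight_utc))  # 2020-12-12 00:00:37
```

The result is deliberately a naive `datetime`, since Python's standard library has no notion of a TAI time zone.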
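So for a timestamp to be “in UTC” really just means it is on the UTC time scale, which includes the leap seconds published by the IERS. The article says “integers in UTC”, which is a little ambiguous, but it probably means “an integer count of UTC seconds since the Unix epoch” (strictly speaking, Unix time does not count leap seconds, so each day is treated as exactly 86,400 seconds long).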
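To make the “integers since the epoch” reading concrete, here is a minimal Python sketch of the two encodings. As far as I know, Parquet's DATE logical type is a count of days since 1970-01-01 and its TIMESTAMP type is a count of milliseconds, microseconds, or nanoseconds since the epoch; the sketch uses whole seconds only to match the wording above. The helper names are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_INSTANT = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_epoch_days(d: date) -> int:
    """Encode a calendar date as whole days since 1970-01-01.

    No time zone is involved: the date stays a naive calendar date.
    """
    return (d - EPOCH_DATE).days


def instant_to_epoch_seconds(dt: datetime) -> int:
    """Encode an aware datetime as whole seconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("a naive datetime does not identify an instant")
    # Floor division keeps pre-1970 instants rounding downward.
    return (dt - EPOCH_INSTANT) // timedelta(seconds=1)


print(date_to_epoch_days(date(2020, 12, 12)))  # 18608
print(instant_to_epoch_seconds(datetime(2020, 12, 12, tzinfo=timezone.utc)))  # 1607731200
```

Note that the date encoding never touches a time zone, so a sales figure recorded for 2020-12-12 stays attached to that calendar day; only the timestamp encoding requires knowing which instant is meant.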