The months-start-at-0 convention is a legacy of POSIX's datetime API, as is the year field being a count from 1900. Java continued these mistakes, and then added the everything-defaults-to-local-tz functionality (POSIX had different functions for UTC versus local tz), and JS copied Java's API in this respect without modification.
Java eventually torched its date/time API not once, but twice: the JDK 1.1 addition of Calendar that deprecated most of the naïve methods on Date, and JDK 8 adding java.time.*, which is roughly the modern model of date/time APIs.
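For anyone who hasn't run into the POSIX conventions directly, here is a minimal Python sketch of what they mean in practice. The helper name `date_from_posix_fields` is made up for illustration; it just assumes the struct tm conventions described above (tm_year counts from 1900, tm_mon runs 0-11, tm_mday runs 1-31).

```python
from datetime import date


def date_from_posix_fields(tm_year: int, tm_mon: int, tm_mday: int) -> date:
    """Hypothetical helper: turn POSIX struct-tm-style fields into a date.

    Assumes tm_year is years since 1900 and tm_mon is 0-based,
    while tm_mday is already 1-based.
    """
    return date(tm_year + 1900, tm_mon + 1, tm_mday)


# tm_year=95, tm_mon=11 means December 1995, not November of the year 95.
print(date_from_posix_fields(95, 11, 4))  # 1995-12-04
```

Note that Python's own time.struct_time does not follow this layout (its tm_mon is 1-12), which is why the conversion is spelled out by hand here.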
ETA: Ahaha, I managed to read "days" as "weekdays". The following is now a permanent marker of my reading incomprehension.
IIRC (and it's been a while, so I may well not RC), specifies that 1 is Monday, 2 is Tuesday, ..., 6 is Saturday and both 0 and 7 are Sunday. And the reason that looks weird is that there is (IIUC) a genuine difference between western nations as to whether a week starts on Sunday or Monday.
Which, you know, sucks. But it also makes it somewhat easy to make a C array that you can index with (.tm_wday%7) to get the day of the week.
Which may also be the reason why months start at 0.
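A rough sketch of that array trick, in Python rather than C. The names `DAY_NAMES` and `day_name` are invented for this example; the only assumption is the numbering described above, where both 0 and 7 mean Sunday.

```python
# Index 0 is Sunday, so a value of 7 wraps back around to it.
DAY_NAMES = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]


def day_name(wday: int) -> str:
    """Map a weekday number in 0..7 to its name; 0 and 7 are both Sunday."""
    return DAY_NAMES[wday % 7]


print(day_name(0), day_name(7), day_name(1))  # Sunday Sunday Monday
```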
I think months from 0 in POSIX were intentional, not mistakes. The things you might want to do with an integer month number include:
1. Look up some attribute or property of the month in a table (number of days in the month in a non-leap year, name of the month, list of fixed holidays in the month, etc).
2. Determine if the month is before or after a different month.
3. Determine how many months one month is after or before another month.
4. Determine which month is N months before/after a given month.
5. Print the month number for a human.
In languages that use 0-based indexing:
#1 is more convenient with 0-based months.
#2-4 only need that the month numbers form an arithmetic progression, so are indifferent to where we start counting.
#5 is more convenient with 1-based months.
So it comes down to this: is it better for your low-level date functions to require that you remember to subtract 1 whenever you do something from #1, or to add 1 whenever you do something from #5?
I'd expect most programs do a lot more #1 than #5, so the code is going to be cleaner if you go with 0-based months.
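To make the five cases concrete, here is a small sketch with 0-based months in Python. All of the names are hypothetical, the tables assume a non-leap year, and `add_months` deliberately ignores the year rollover to keep things short.

```python
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap


def days_in_month(month0: int) -> int:
    """#1: table lookup needs no adjustment with 0-based months."""
    return DAYS_IN_MONTH[month0]


def months_between(a0: int, b0: int) -> int:
    """#3: a plain difference, the same whichever base is used."""
    return b0 - a0


def add_months(month0: int, n: int) -> int:
    """#4: month N steps away, wrapping within the year (year ignored)."""
    return (month0 + n) % 12


def display_number(month0: int) -> int:
    """#5: the one place where 1 has to be added."""
    return month0 + 1


print(MONTH_NAMES[0], days_in_month(1), add_months(10, 3), display_number(0))
# January 28 1 1
```

#2 is just `a0 < b0`, so it is left out.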
In JavaScript's Date API, the month number is 0-based (January is 0), but the day number is 1-based (the first day of January is 1, not 0). That inconsistency is unexpected and makes it harder to use.
I'm glad most modern date APIs use 1-based numbers to represent the month and the day (not to mention the year from 1 AD onwards).
> 1. Look up some attribute or property of the month in a table (number of days in the month in a non-leap year, name of the month, list of fixed holidays in the month, etc).
The first two cases basically exist internally within the date-time library, so it's not a particularly common example for end users to be doing. It's also not particularly hard to do #1 with 1-based months, because you can either subtract one before indexing, or you can just have entry 0 be a null-ish value.
The thing is, month names already have an implicit numerical mapping: ask anyone, even most programmers, what number to assign to the month of January, and you're likely to get 1. So an integer representing a month name is going to be presumed to be 1-based unless documented otherwise, and having it return 0 instead is going to confuse more programmers than not.
In other words, the trade-off isn't "is the code going to be cleaner for this case or that case" but rather "do we make an API that is going to confuse everyone's first impression of it, or do we make the code marginally more complex in one specific scenario?"
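The two workarounds for table lookup with 1-based months could look something like this in Python. This is only a sketch: the function and table names are invented, and the range check is my own addition (without it, month 0 would silently index December in the subtract-one version, because Python allows negative indexes).

```python
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap
# Same table with a null-ish entry in slot 0 so January lands at index 1.
DAYS_IN_MONTH_PADDED = [None] + DAYS_IN_MONTH


def _check(month1: int) -> None:
    if not 1 <= month1 <= 12:
        raise ValueError(f"month must be in 1..12, got {month1}")


def days_by_subtracting(month1: int) -> int:
    """Option A: subtract one before indexing the 0-based table."""
    _check(month1)
    return DAYS_IN_MONTH[month1 - 1]


def days_by_padding(month1: int) -> int:
    """Option B: index a 13-entry table directly; slot 0 is never used."""
    _check(month1)
    return DAYS_IN_MONTH_PADDED[month1]


print(days_by_subtracting(2), days_by_padding(2))  # 28 28
```

Python's standard library takes the padding route for month names: calendar.month_name has 13 entries and entry 0 is the empty string.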
> The thing is, month names already have an implicit numerical mapping: ask anyone, even most programmers, what number to assign to the month of January, and you're likely to get 1.
Couldn't you make a similar argument then that when representing English text, 'A' should be 1? That's the number most people would give when asked to assign numbers to the alphabet. Not many people are going to say 65 which is 'A' in ASCII and Unicode.
Programs are not people.
I generally prefer, and believe it leads to fewer bugs, to represent things in programs in the ways that best fit the operations that the program will be doing on them, translating between that representation and external representations upon input and output.
> Couldn't you make a similar argument then that when representing English text, 'A' should be 1?
It actually is!
>>> hex(ord('A'))
'0x41'
>>> hex(ord('a'))
'0x61'
The five-bit intuitive numerical mapping of the letter is prefixed with the bits 10 (for uppercase) or 11 (for lowercase). A similar thing happens with digits, where the four-bit intuitive numerical mapping of the digit is prefixed with the bits 011.
For letters, this leads to a "hole" at 0x40 and 0x60. Instead of making the letters zero-based (that is, 'A' being 0x40 and 'a' being 0x60), they decided to keep the intuitive mapping (which starts at 1), and fill the "hole" with a symbol.
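A quick way to check the bit layout described above, as a Python sketch. The helper names are made up; the masks assume 7-bit ASCII, with 0x1F selecting the low five bits and 0x0F the low four.

```python
def letter_index(ch: str) -> int:
    """Low five bits of an ASCII letter: 'A'/'a' -> 1 ... 'Z'/'z' -> 26."""
    return ord(ch) & 0x1F


def digit_value(ch: str) -> int:
    """Low four bits of an ASCII digit: '0' -> 0 ... '9' -> 9."""
    return ord(ch) & 0x0F


print(letter_index("A"), letter_index("a"), letter_index("Z"))  # 1 1 26
print(digit_value("7"))                                         # 7

# The prefixes: 0b10 for uppercase, 0b11 for lowercase, 0b011 for digits.
print(bin(ord("A") >> 5), bin(ord("a") >> 5), bin(ord("0") >> 4))
# 0b10 0b11 0b11

# The symbols that fill the "holes" at 0x40 and 0x60.
print(chr(0x40), chr(0x60))  # @ `
```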
#5 does not seem like something you should be writing anyway; there ought to be a standard library mechanism for that. So even if you do that often you might not need to care about it.
I think it's basically tradition at this point that languages make a hash of their date/time abstractions, maybe getting them right after a half dozen iterations.
[EDIT] Realised this is the behavior I would expect. Parse is doing what I would expect it to, taking into account the given time being UTC. It's then returning an object that's in the local timezone. Still goes to show just how confusing this datetime stuff can be when even the expected behavior looks wrong.
Wait… how is this not a bug according to both Microsoft's own spec[0] and ISO 8601[1]? The Z specifically means this time is in UTC.
The behavior is not at all what I would expect from reading the docs [0]:
> A string that includes time zone information and conforms to ISO 8601. In the following examples, the first string designates Coordinated Universal Time (UTC), and the second designates the time in a time zone that's seven hours earlier than UTC:
> "2008-11-01T19:35:00.0000000Z"
> "2008-11-01T19:35:00.0000000-07:00"
[EDIT] I get it now: it's parsing it right, it's just that it's then putting it into a datetime object that's in the local timezone – which is what I would expect. The alternative would be counterintuitive to me.
That is describing the different formats it will accept.
Further down the page:
> If s contains time zone information, this method returns a DateTime value whose Kind property is DateTimeKind.Local and converts the date and time in s to local time. Otherwise, it performs no time zone conversion and returns a DateTime value whose Kind property is DateTimeKind.Unspecified.
Yeah, I confused myself and put an edit in there. I first thought it was just ignoring the Z then giving out a UTC object. Now I realize it was doing what I expected it to: parse correctly and hand out a local time.
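For what it's worth, the same "parse as UTC, hand back local" behavior can be sketched in Python. This is an analogy only, not the .NET API: `parse_to_local` is a made-up name, and the Z suffix from the docs example is written as +00:00 here because older Python versions don't accept Z in fromisoformat.

```python
from datetime import datetime, timezone


def parse_to_local(text: str) -> datetime:
    """Parse an ISO 8601 string with an offset, then convert to local time.

    The result names the same instant; only the offset it is expressed
    in changes (to whatever the system's local zone is).
    """
    parsed = datetime.fromisoformat(text)
    return parsed.astimezone()  # no argument: system local time zone


utc_instant = datetime(2008, 11, 1, 19, 35, tzinfo=timezone.utc)
local = parse_to_local("2008-11-01T19:35:00+00:00")

print(local.isoformat())     # wall-clock fields depend on the machine's zone
print(local == utc_instant)  # True: aware datetimes compare by instant
```

So the printed hour can look "wrong" at first glance while the value is still exactly the instant that was parsed.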
Of course this is avoided by using getFullYear(), but I've always wondered why a language that came out in 1995 had a function that returned two-digit years.
> - Converting from string datetime to a datetime object will automatically convert the time to the local tz
I think this makes sense in the context of client-side JavaScript, whose sole point is to present UI to the user. The dates aren't necessarily converted to local TZ (the date always remains the same fixed point in time), but rather any output is local to the timezone.
IMHO this is more good than bad, given the intention of JS is to build (or augment) UIs. It's just not great that it's hard to do anything other than that.