I've observed that lines of code get measured differently depending on whether the writer wants to convince you that the software is big and complicated and deserves respect for its sheer magnitude, OR wants you to appreciate its brevity/simplicity/approachability. The first decision in this decision tree is whether you just use wc, or whether you filter out empty lines. Next go the comments. Next go the syntactically less significant lines (just a closing brace that could go on the previous line). Wash, rinse, repeat.
It's a variant of the "I didn't have time to write you a short ____ so I wrote a long one instead" adage.
I would guess (but only guess) that this article erred on the side of overstating size.
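To make the counting choices above concrete, here is a rough sketch of how the same file gives four different "lines of code" numbers. Everything in it is an assumption for illustration: the function name `count_loc`, the `//` comment prefix, and the set of "brace-only" lines are made up, and it only handles single-line comments.

```python
def count_loc(source: str, comment_prefix: str = "//") -> dict:
    """Count lines four ways, each stricter than the last."""
    # Raw count: roughly what `wc -l` reports for a file ending in a newline.
    lines = source.splitlines()
    # Step 1: drop empty and whitespace-only lines.
    non_blank = [ln for ln in lines if ln.strip()]
    # Step 2: drop lines that are only a single-line comment.
    non_comment = [
        ln for ln in non_blank if not ln.strip().startswith(comment_prefix)
    ]
    # Step 3: drop lines that hold nothing but a brace or closing punctuation
    # (an arbitrary, illustrative set).
    brace_only = {"{", "}", "};", ");"}
    significant = [ln for ln in non_comment if ln.strip() not in brace_only]
    return {
        "raw": len(lines),
        "non_blank": len(non_blank),
        "non_comment": len(non_comment),
        "significant": len(significant),
    }


SAMPLE = """\
// add two numbers
int add(int a, int b)
{
    return a + b;
}

int main()
{
    // call it
    return add(1, 2);
}
"""

print(count_loc(SAMPLE))
# {'raw': 11, 'non_blank': 10, 'non_comment': 8, 'significant': 4}
```

Same eleven-line file, and you can honestly report anything from 11 down to 4 depending on which story you want to tell.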
Why? If you have 100+ engineers at any given time, shipping features over a period of a few years, you'll hit 1M in no time.
It sounds like a lot, but it really isn't when you consider the number of people working on it.
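A back-of-the-envelope check of that claim, as a sketch only. The inputs are assumptions: "a few years" is taken as 3, a working year as 250 days, and the 1M figure is treated as net lines that stayed in the codebase. The function name is made up for the example.

```python
def lines_per_engineer_per_day(
    total_lines: int,
    engineers: int,
    years: float,
    working_days_per_year: int = 250,  # assumed, not from the article
) -> float:
    """Average net lines each engineer must add per working day."""
    return total_lines / (engineers * years * working_days_per_year)


rate = lines_per_engineer_per_day(1_000_000, engineers=100, years=3)
print(round(rate, 1))  # 13.3
```

Under those assumptions, 1M lines works out to roughly 13 net lines per engineer per working day.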
Could you build the same thing with fewer LoC? Probably. But it's not like it was built from the ground up with every piece of functionality planned out from day 1, so there will be inefficiencies.
Comparing it to Linux is pointless. Platforms should be relatively stable; products are ever-changing, and the shelf life of the code is sometimes measured in weeks or months.
Cannot fathom why LOC is a metric? Me neither. Lots of stuff has millions of lines of code, in various languages, with wide-ranging feature sets and functionality. LOC has near-zero meaning across the language/project boundary.