Sorry, but that analysis is too sloppy to allow any such comparisons. If you look at the scraped document list [1]:

* Most of these are not normative! They're not specifications; they're guides, recommendations, terminology explainers, and so on.
* A lot of documents are irrelevant to implementing a web browser (XSLT, XPath, RDF, XHTML, ITS, etc.).
* A lot are obsolete (e.g. SMIL, OWL).
* There are tons of duplicate versions (all of CSS 1-3 are included; multiple versions of HTML, MathML, and of course the irrelevant XML-based standards).
* Many standards are scraped both as individual section files and as a single complete.html file. He didn't notice this, and counted both.

As a particularly egregious example, he includes every version of the Web Content Accessibility Guidelines (WCAG) standard, going back to 1999, each of which is large.

I have not done any kind of analysis myself (it would need to be thorough to actually be fair), but if you prune it down to the core technologies (HTML5, CSS, ECMAScript, PNG/GIF/WebP, etc.), I'll wager the total is probably less than a million words, or at the very least less than 2 million. The ECMAScript spec is just 356,000 words.

[1] https://paste.sr.ht/~sircmpwn/475ad10f9ff9f63cd0a03a3f998370...