Haven't looked at the code that generates this list (if available), but that sure looks to me like double-counting going on here. Most files in \Windows\WinSXS are hardlinks.
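For anyone who wants to test the double-counting theory, here is a rough Python sketch. It is not the script behind the list (which I haven't seen), and `disk_usage` is just a name I made up. It walks a tree and adds up file sizes twice: once naively, and once counting each (device, inode) pair only a single time, so hardlinked files are not counted again.

```python
import os
import stat


def disk_usage(root):
    """Return (naive_bytes, deduped_bytes) for regular files under root.

    naive_bytes sums every directory entry; deduped_bytes counts each
    underlying file once, identified by its (st_dev, st_ino) pair.
    """
    seen = set()
    naive = 0
    deduped = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # unreadable entry, skip it
            if not stat.S_ISREG(st.st_mode):
                continue
            naive += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                deduped += st.st_size
    return naive, deduped
```

If the two numbers differ a lot for \Windows\WinSXS, that would point to hardlinks being counted more than once. Note this sums logical file sizes, not allocated clusters, so it is only an approximation of real disk use.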
What's the story on that footprint, anyway? We went from (from memory) ~250-350MB Win98 to ~700-800MB XP to ~10-15GB(!?!?!) Win7, and just up from there. Plus the default settings seemed to start going really crazy with swap space/caching around the time of Win7. Another 10+GB if you didn't tell it to knock that crap off. Why the sudden, giant shift? They didn't add 10-15x the features, that's for sure.
WinSxS (Windows Side By Side) assemblies were introduced to avoid DLL hell by allowing Windows to store multiple versions of installed DLLs. So even a minor security patch may leave the former version around because other apps may use/expect it. I think that might add some bloat over time? Also Windows Update installer caches. A ton of Windows updates actually leave their installers around in case you want to uninstall them. That can add up! I've seen it easily get to 1-2 GB.
They did, to some extent. Plug in almost any piece of standard consumer hardware and it'll probably mostly just work without a network connection. All those drivers don't take up zero space, but the benefit when my mom plugs in a printer and it just works makes it worth it.
I think at least some of it is the Windows on Windows stuff to allow 64 bit machines to run both 32 and 64 bit software. Weren't the 32 bit versions of Win7 about half the size of the 64 bit ones?
There's still a lot of size growth over time, of course.
To this point, the files in the install are compressed, and I'm sure XML metadata is the sort of thing that compresses well with the DEFLATE algorithm they likely use.
I discovered this a few months ago, when I went looking for XMP metadata in the filesystem and used the magic number trick to extract it from files of all kinds.
I found it is common to find XMP inside media files embedded inside Windows EXE, as well as Linux binaries, JAR, Microsoft Word and other composite formats.
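A minimal sketch of the kind of magic number scan I mean, in Python. This is an illustration rather than the exact tool I used, and `find_xmp_packets` is a made-up name. It assumes the packet is stored as plain UTF-8 bytes (not UTF-16, and not inside a compressed stream) and looks for the `<?xpacket begin=...?>` header and `<?xpacket end=...?>` trailer that wrap an XMP packet.

```python
import re

# An XMP packet is wrapped in <?xpacket begin=... ?> ... <?xpacket end=... ?>.
# Assumption: the packet is stored as uncompressed UTF-8 bytes.
XPACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>",
    re.DOTALL,
)


def find_xmp_packets(data: bytes):
    """Return the body of every XMP packet found in a blob of bytes."""
    return [m.group(1) for m in XPACKET_RE.finditer(data)]
```

Because it only looks at raw bytes, the same function can be pointed at an EXE, a JAR, or a Word document, as long as the embedded media was stored without compression.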
Complex media objects frequently use an encapsulation system such as ZIP. When a PNG file is incorporated into a JAR or a Word Document, the XMP content in the file may not be compressed because the archiver may not attempt to compress the png file since it assumes the data is already compressed.
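To see which members of a ZIP-based container were stored without compression, something like the following Python sketch works (`stored_members` is a hypothetical helper name; it relies only on the standard `zipfile` module).

```python
import zipfile


def stored_members(archive):
    """Return names of non-empty members kept uncompressed in a ZIP archive.

    `archive` may be a path or a binary file-like object (JAR, DOCX, ...).
    """
    with zipfile.ZipFile(archive) as zf:
        return [
            info.filename
            for info in zf.infolist()
            if info.compress_type == zipfile.ZIP_STORED and info.file_size > 0
        ]
```

Any PNG that shows up in that list sits in the container byte for byte, so an uncompressed XMP packet inside it is visible to a raw byte scan.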
XMP is very good from the viewpoint of content creators in terms of having comprehensive metadata incorporated into files so that it does not get out of sync. XMP data is RDF data using an improved version of Dublin Core, IPTC and other industry RDF vocabularies. You can write SPARQL queries right away, plus XMP specifies a way to make an XMP packet based on pre-existing metadata in common industry schemes.
The XMP packets can get big, and you sometimes see people make a tiny GIF image (say a transparent pixel GIF) that is bulked up 100x because of bulky metadata. Once you package data for delivery to consumers you want to strip all that stuff out.
> When a PNG file is incorporated into a JAR or a Word Document, the XMP content in the file may not be compressed because the archiver may not attempt to compress the png file since it assumes the data is already compressed.
PNG can apply DEFLATE to blocks though, right? Does XMP not use it?
Deflating can be applied to some chunks, but not at will. The zTXt chunk can be compressed while for example the tEXt chunk cannot. The newer iTXt chunk can vary.
The two former are limited in scope and language encoding support, so iTXt is typically used for extended textual data such as XML/XMP etc. But whether it is saved compressed or not depends on the PNG encoder/host used (there can also be multiple instances of these chunks in the same file).
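To check what a given encoder did, you can walk the chunk list yourself. A minimal Python sketch, with `text_chunks` as a made-up name: it relies on the PNG chunk layout (4-byte big-endian length, 4-byte type, data, 4-byte CRC) and on the iTXt compression flag being the byte right after the keyword's null terminator. It does not verify CRCs or decompress anything.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def text_chunks(png: bytes):
    """Yield (chunk_type, keyword, is_compressed) for each textual chunk."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        # Chunk header: 4-byte big-endian data length, then 4-byte type.
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        pos += 12 + length  # header (8) + data + CRC (4)
        if ctype in (b"tEXt", b"zTXt", b"iTXt"):
            keyword, _, rest = data.partition(b"\x00")
            if ctype == b"tEXt":
                compressed = False  # tEXt is never compressed
            elif ctype == b"zTXt":
                compressed = True   # zTXt is always compressed
            else:
                # iTXt: first byte after the keyword is the compression flag.
                compressed = rest[:1] == b"\x01"
            yield ctype.decode("ascii"), keyword.decode("latin-1"), compressed
        if ctype == b"IEND":
            break
```

XMP in a PNG normally lives in an iTXt chunk with the keyword `XML:com.adobe.xmp`, so that is the entry to look for in the output.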
Photoshop for instance saves uncompressed, I guess to give fast access for performance reasons (i.e. file viewers using galleries for numerous images while displaying their metadata).
Bear in mind, that was the Vista days, and Windows 10 now supports even more devices. 800MB of drivers at the time.
I would not be surprised if Windows supported by default upwards of 10000 drivers. It works pretty much flawlessly on even somewhat obscure and old hardware. And when your OS is installed on that many consumer devices, and not informally standardized servers, you are going to meet those weird devices one way or the other.
Windows drivers may also take up a bit more space individually because of the overhead caused by either the Windows Driver Model or Windows Driver Framework, but that's the price to pay to not have a driver crashing and bringing down your entire system. Yes, Linux, I'm looking at you.
https://gist.githubusercontent.com/riverar/f4a56b91580af1bd3...