| Different distros require different metadata for their packages. But in general amongst the top 10 distros, almost all metadata in a package is optional. You don't have to specify the URL where it came from or who the author was in order to get a package accepted. Some distros do require a URL, mainly because they build from source to install on a system. But other distros merely accept a source-package which bundles the source code and that's built on their build servers and released after being peer-reviewed. But the peer-review is a manual process, so it's human-fallible. As an example, let's compare the way two distros (Fedora and Debian) package an old piece of software: aumix. Taking a look at this spec file [1] for fedora, we see two pieces of metadata: a URL to a homepage, and a URL to the software. The URL is not used for packaging at all; it's merely a reference. The URL to the file can be used to download the software, but if the file is found locally, it is not downloaded. And guess what? That source file is provided locally along with the other source files and patches in a source package. So whatever source file we have is what we're building. This file doesn't contain a reference to any hashes of the source code, but the sources file [2] in Fedora's repo does. With Debian we have a control file [3] that defines most of the metadata for the package. Here you'll find a homepage link, which again isn't used for builds. The path to a download is contained in a 'watch' file [4], which is again not referenced if source is provided, and generally only used to find updated versions of the software. There are no checksums anywhere of the source used. The source to aumix actually provides its own packaging file [5], provided by the authors. Apparently the URL used here is an FTP mirror, not the HTTP mirror provided by the earlier packagers. Could that be intentional or a mistake? And could they possibly be providing different source code, especially considering the hosts themselves are different? It's clear that there's a lack of any defined standard of securely downloading the source used in packages, much less a way of determining if the original author's source checksum is the same as the packager's source checksum. There are several points where the source could be modified and nobody would know about it, before the distro signs it as 'official'. [1] http://pkgs.fedoraproject.org/cgit/aumix.git/tree/aumix.spec... [2] http://pkgs.fedoraproject.org/cgit/aumix.git/tree/sources?h=... [3] http://anonscm.debian.org/viewvc/collab-maint/deb-maint/aumi... [4] http://anonscm.debian.org/viewvc/collab-maint/deb-maint/aumi... [5] http://sources.debian.net/src/aumix/2.9.1-2/packaging/aumix.... |
But this is still secondary to the basic point: as a Linux user, I get packages from my distro, not from the upstream source, so I don't have to go searching around the Internet for packages or package updates, wondering whether I've got the right source, wondering why there isn't an https URL for it, etc., which is what Windows users have to do according to the article (and OS X users too, for the most part, though the article doesn't talk about that). The distro does all that, and either I trust them to do it or I don't (in which case I go find another distro). The fact that the distro doesn't make it easy for me, the user, to see how they verify the sources they use, does not mean they aren't verifying the sources they use.
Also, while it's true that the distro verification process is human-fallible, as you say, and it would be nice if every OSS project made it easy for distros to automate the process instead, it's still a lot less human fallible than having every single user go searching around the Internet for software. Distro packagers at least have some understanding of what they're doing, and they at least know who the authoritative source for a particular package is supposed to be without having to depend on Google's page rank algorithm.