There aren't many better ways to ID people who've downloaded music illegally than to have them upload copies of those music files to a central location where they can be easily analyzed. Hell, I'd love a chance to run stats on that dataset and I don't even have any skin in the game.

Some things you could look for:

- Files with the names of known release crews hidden away in an ID3v1 comment somewhere;
- Files uploaded by multiple people with identical incorrect (misspelled artists/titles) or unique (rip logs, comments, ratings) metadata;
- Files with timestamps earlier than the title's official street date (assuming the dedicated upload client preserves those dates);
- Multiple uploads of files with bit-for-bit identical audio content that doesn't correspond exactly to any official digital release (like identical MP3 encodes of a non-perfect CD rip, or identical versions of anything sourced from vinyl or cassette -- like a lot of the stuff on boutique MP3 blogs);
- Files with audio data identical to anything ever subject to an official takedown notice, or downloaded by the RIAA from BitTorrent, Usenet or a Megaupload-type file sharing site.

Any given release will only have been distributed in a handful of legitimate digital versions -- basically the CD release(s) and any licensed digital download services. Any file uploaded to an online music locker that doesn't match those legit sources will be suspect, and any of those files that are uploaded to Google in bit-for-bit identical versions by multiple people will be a huge red STOLEN flag.

I have very little illegally-downloaded music relative to the size of my entire collection, but I still wouldn't go anywhere near a service like this. The data extractable from the audio files people upload would significantly reduce the effort needed for the record companies to go after even small-time downloaders. That's never really been feasible before.
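To make the "bit-for-bit identical audio" check concrete, here is a rough Python sketch of how it could be done. It is only an illustration under my own assumptions, not how any actual locker service works: the function names, the `(user_id, file_bytes)` upload format, and the `official_digests` set are all hypothetical, and it only understands MP3 files with ID3v1/ID3v2 tags. The idea is to strip the tags, hash what's left, and flag any hash that is not a known official release but shows up under more than one account.

```python
import hashlib
from collections import defaultdict


def strip_id3(data: bytes) -> bytes:
    """Drop a leading ID3v2 tag and a trailing ID3v1 tag, if present."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # The ID3v2 tag size is a 28-bit "syncsafe" integer (7 bits per byte).
        size = 0
        for b in data[6:10]:
            size = (size << 7) | (b & 0x7F)
        # Flag bit 0x10 marks an extra 10-byte footer (ID3v2.4).
        footer = 10 if data[5] & 0x10 else 0
        data = data[10 + size + footer:]
    # An ID3v1 tag is exactly the last 128 bytes and starts with "TAG".
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        data = data[:-128]
    return data


def audio_digest(data: bytes) -> str:
    """SHA-256 of the file with its ID3 tags removed."""
    return hashlib.sha256(strip_id3(data)).hexdigest()


def flag_shared_unofficial(uploads, official_digests, min_users=2):
    """Return {digest: users} for audio that matches no official release
    but was uploaded by at least `min_users` distinct accounts.

    uploads: iterable of (user_id, file_bytes) pairs (hypothetical format).
    official_digests: set of digests of known legitimate releases.
    """
    users_by_digest = defaultdict(set)
    for user_id, data in uploads:
        users_by_digest[audio_digest(data)].add(user_id)
    return {
        digest: users
        for digest, users in users_by_digest.items()
        if digest not in official_digests and len(users) >= min_users
    }
```

Hashing after stripping the tags means two people who retagged the same download still collide, which is the point of the check. A real pipeline would presumably also have to deal with other tag formats and containers, which this sketch ignores.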