by willifred 5513 days ago
From the Lifehacker review:

> It's also worth noting that they're looking to crack down on piracy, so depending on how well it works and how much of your music is illegal, that could be a deal killer for some.

This would be a deal killer for me and nearly everyone I know.

3 comments

There aren't many better ways to ID people who've downloaded music illegally than to have them upload copies of those music files to a central location where they can be easily analyzed. Hell, I'd love a chance to run stats on that dataset and I don't even have any skin in the game. Some things you could look for:

- Files with the names of known release crews hidden away in an ID3v1 comment somewhere;

- Files uploaded by multiple people with identical incorrect (misspelled artists/titles) or unique (rip logs, comments, ratings) metadata;

- Files with timestamps earlier than the title's official street date (assuming the dedicated upload client preserves those dates);

- Multiple uploads of files with bit-for-bit identical audio content that doesn't correspond exactly to any official digital release (like identical MP3 encodes of a non-perfect CD rip, or identical versions of anything sourced from vinyl or cassette -- like a lot of the stuff on boutique MP3 blogs);

- Files with audio data identical to anything ever subject to an official takedown notice, or downloaded by the RIAA from BitTorrent, Usenet or a Megaupload-type file sharing site.
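
To make the first bullet concrete, here's a rough sketch of pulling the comment out of an ID3v1 tag (the fixed 128-byte block at the end of an MP3) and checking it against a list of crew names. The names in `CREW_MARKERS` and the helper names are made up for illustration; I have no idea what anyone would actually scan for.

```python
from typing import Optional

# Hypothetical markers, purely for illustration.
CREW_MARKERS = {"examplecrew", "ripgroup"}


def id3v1_comment(data: bytes) -> Optional[str]:
    """Return the ID3v1 comment field, or None if the file has no ID3v1 tag."""
    # An ID3v1 tag is the last 128 bytes of the file and starts with b"TAG".
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]
    # Layout: "TAG"(3) title(30) artist(30) album(30) year(4) comment(30) genre(1)
    raw = tag[97:127]
    # Fields are NUL-padded; keep only the text before the first NUL.
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def mentions_crew(data: bytes, markers=CREW_MARKERS) -> bool:
    """True if the ID3v1 comment contains any of the given markers."""
    comment = id3v1_comment(data)
    if comment is None:
        return False
    lowered = comment.lower()
    return any(marker in lowered for marker in markers)
```

This only looks at ID3v1; ID3v2 comment frames would need a separate parser.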

Any given release will only have been distributed in a handful of legitimate digital versions -- basically the CD release(s) and any licensed digital download services. Any file uploaded to an online music locker that doesn't match those legit sources will be suspect, and any of those files that are uploaded to Google in bit-for-bit identical versions by multiple people will be a huge red STOLEN flag.
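
A minimal sketch of that last check, assuming the service can see raw file bytes: strip the ID3 tags so only the audio payload is hashed, then flag any digest uploaded by two or more users that isn't on a list of known legitimate releases. The function names and the `known_legit` set are my own assumptions, not anything Google has described.

```python
import hashlib
from collections import defaultdict


def strip_id3(data: bytes) -> bytes:
    """Drop a leading ID3v2 tag and a trailing ID3v1 tag, if present."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # The ID3v2 size field is 4 "syncsafe" bytes (7 useful bits each).
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        end = 10 + size          # 10-byte header plus tag body
        if data[5] & 0x10:       # ID3v2.4 footer flag adds 10 more bytes
            end += 10
        data = data[end:]
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        data = data[:-128]       # ID3v1 is a fixed 128-byte trailer
    return data


def audio_digest(data: bytes) -> str:
    """SHA-256 of the file with ID3 tags removed."""
    return hashlib.sha256(strip_id3(data)).hexdigest()


def suspicious_digests(uploads, known_legit):
    """Map digest -> set of users, for audio shared by 2+ users and not known legit.

    uploads: iterable of (user, file_bytes) pairs.
    known_legit: set of digests for official releases (hypothetical input).
    """
    uploaders = defaultdict(set)
    for user, data in uploads:
        uploaders[audio_digest(data)].add(user)
    return {
        digest: users
        for digest, users in uploaders.items()
        if len(users) >= 2 and digest not in known_legit
    }
```

Hashing the tag-stripped payload means two people who retagged the same rip still collide, which is the point of the "identical audio content" bullet.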

I have very little illegally-downloaded music relative to the size of my entire collection, but I still wouldn't go anywhere near a service like this. The data extractable from the audio files people upload would significantly reduce the effort needed for the record companies to go after even small-time downloaders. That's never really been feasible before.

Sure sounds like something a company that's still trying to make deals with record labels might say, but not necessarily do.
How can Google possibly know whether the files you upload were bought legally or not?
I would think the music industry isn't past seeding P2P networks with watermarked files and then suing Google to get a court order to scan people's collections en masse.
They can't tell if your copy is legal or not, but if your copy of an album hashes to the same as the copy found via TPB, they'd consider it suspect.

I think there's enough variation in CD ripping software and encoder profiles that different rips of the same album made with different software are not bit-for-bit the same. Hash matching would also be trivially defeatable, so I'm not sure what Google would gain from doing that.
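
To illustrate the "trivially defeatable" part: a cryptographic hash changes completely if even one bit of the file changes, so exact-hash matching only catches untouched copies of the same file. A quick sketch, with random-looking bytes standing in for an encoded track (the helper names are mine):

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def flip_one_bit(data: bytes, index: int = 0) -> bytes:
    """Return a copy of data with the lowest bit of one byte flipped."""
    changed = bytearray(data)
    changed[index] ^= 0x01
    return bytes(changed)


original = bytes(range(256)) * 4            # stand-in for encoded audio
tweaked = flip_one_bit(original, index=100)

# Same length, one bit different, unrelated digests.
print(sha256_hex(original) == sha256_hex(tweaked))  # False
```

An acoustic fingerprint would be harder to dodge than a plain hash, but that's a different technique from what's being discussed here.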

hashes, ID3 tags, and distribution patterns?