Hacker News
by meowface 4154 days ago
>Anyone who would use this data maliciously probably already has it.

You might be surprised. The fact that these dumps are supposedly quite old certainly mitigates the risk, but I've seen cases of primary email accounts being taken over using a plaintext password from a dump 5+ years old. No one had tried the password on the email account earlier because the email address wasn't in the dump and wasn't identical to the username, though it was very close.

Aggregators like haveibeenpwned.com and LastPass use the passwords they scrape responsibly; they don't release them all in a big batch like this. Many cybercriminals do the same kind of scraping and share these aggregated lists privately, but they're always going to be missing things, so there's no question they're all going to be pulling in your list, too. And odds are your list includes at least one dump that a lot of them missed.

I do understand there is some research benefit here, but even in the best possible scenario I don't think the value from the research outweighs the costs.


First of all, a good number of these passwords were simply gathered through Google. Some were gathered via the archive.org archive of Pastebin pastes and its normal web page archive. Some were from forums that were located via Google. This data is already out there; aggregating it doesn't make it any easier to hack these people.

Try searching for "Cucum01:Ber02" or "shawman:badman" and you will see how many passwords are indexed. I have hundreds of searches like these that I monitor and scrape.

Second, I regularly share my data with the owners of password checking sites such as haveibeenpwned so that users can find out about these breaches. Releasing this data isn't something I have taken lightly; I debated it for years. I have weighed the risks and felt it was important to release the raw data, although not everyone will agree with me on this. I made a good effort to minimize the risks to actual users.

Finally, keep in mind that most users are already at risk simply because they have bad passwords. Ten percent of users have a password on the top 1000 list. A large percentage of users are at risk because the websites they are on don't have proper security. This is how people get hacked, not because of a password found on this list.

Still, the whole purpose of a password is to remain secret. He's certainly doing these users a disservice by releasing this list, regardless of the hypothetical likelihood of the data already being available. Basically, the arguments for doing this all seem to boil down to "they should already know their passwords are compromised", which nobody can guarantee is the case.

I agree that having a crappy password puts you at risk, but what about the people who genuinely tried to use some common sense but are on this list anyway? Is it their fault for not religiously keeping up with the latest indexed password lists?