by aaronmdjones 1800 days ago
> At the time, more than 16,000 employees had access to users’ private data, according to the book.

> Stamos suggested tightening access to fewer than 5,000 employees and fewer than 100 for particularly sensitive information like passwords.

I'm sorry, what?

I can tell you the number of legitimate engineers that should have access to users' passwords.

It's a nice, round number.

It's zero.


"like passwords" probably comes from the journalist and doesn't actually mean that anyone has access to passwords.
Passwords wouldn't be the most sensitive data Facebook holds for most people - private photos sent via Messenger are probably way more sensitive.
It’s not zero. Hashed passwords are still passwords and should be treated as such. “Zero” implies that hashed passwords are not passwords, since otherwise you won’t get to zero.

Just because passwords are hashed doesn’t mean you can give access to them willy nilly and happily claim that “zero” people have password access.

> Hashed passwords are still passwords and should be treated as such.

Agreed.

> “Zero” implies that hashed passwords are not passwords, since otherwise you won’t get to zero.

You can get to zero:

- No humans in the serving path servers' ACLs.

- Diagnostic/recovery servers for humans which require that the person submit a justification that links to a ticket/bug/outage, wait for a second person to approve, perform high-level operations that affect sensitive data ("restore user from backup at timestamp T") rather than exposing direct access ("read from backup", "write live user"), and keep an audit record for later.
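
To make the second bullet concrete, here is a minimal sketch of such a gateway. Everything in it (the `RecoveryGateway` class, the operation name, the ticket format) is hypothetical and only illustrates the flow: justification, second-person approval, high-level operations only, audit record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessRequest:
    requester: str
    operation: str                  # high-level operation name, never raw reads/writes
    ticket: str                     # justification: link to a ticket/bug/outage
    approver: Optional[str] = None  # filled in by a second person


class RecoveryGateway:
    """Hypothetical gateway: humans go through this, never through the data store."""

    # Only high-level operations are exposed; "read_from_backup" is not one of them.
    ALLOWED_OPERATIONS = {"restore_user_from_backup"}

    def __init__(self):
        self.audit_log = []

    def submit(self, requester, operation, ticket):
        if operation not in self.ALLOWED_OPERATIONS:
            raise PermissionError(f"operation not exposed: {operation}")
        if not ticket:
            raise ValueError("a justification ticket is required")
        self.audit_log.append(("submitted", requester, operation, ticket))
        return AccessRequest(requester, operation, ticket)

    def approve(self, request, approver):
        if approver == request.requester:
            raise PermissionError("requester cannot approve their own request")
        request.approver = approver
        self.audit_log.append(("approved", approver, request.operation, request.ticket))

    def execute(self, request):
        if request.approver is None:
            raise PermissionError("second-person approval is required")
        self.audit_log.append(
            ("executed", request.requester, request.operation, request.ticket)
        )
        # A real system would perform the operation here; this sketch only records it.
        return f"{request.operation} done for {request.ticket}"
```

The unilateral on-call exception mentioned in this thread would be one more code path in `execute`, and it should still write to the audit log.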

Everything is about trade-offs. This approach takes more engineering time to set up and, if not done well, can really slow down common tasks. And there are certainly reasons there might be exceptions, e.g. allowing the primary on-call to have unilateral access can speed recovery over waiting for a second person to be available. But zero is possible, and stories like this remind us of its value.

A long time ago I managed FB's udb backups, the central MySQL schema around which all other services were strung. Even from the beginning FB never stored plaintext passwords. Can't say there weren't log excursions or the like, but when found these would have been critical bugs and fixed immediately.
They probably mean some access to production servers where it might be scanned in memory or otherwise grabbed using debug tooling. Making this possible for 0 people will be challenging and at the least add a lot of complexity.
Perhaps they mean access to reset a user password.

I have a hard time believing that Facebook would store user passwords without at least hash + salt which makes it virtually unrecoverable.

Is it not possible to only have the hashes or does it have to get persisted somewhere in the process?
Usually it's in the logs. So a small number of SREs can sometimes access them (if they are logged). And even if they are not logged, they can always show up during tcpdump debugging of network issues and such.

Client side hashing could solve this, but almost no one does it.

> Usually it's in the logs.

That is definitely not usual

Facebook reported that they logged passwords in plaintext by accident a couple of years ago.

https://about.fb.com/news/2019/03/keeping-passwords-secure/

It's pretty common; a lot of places have blanket logging and it hasn't occurred to them to disable it for login attempts. It is obviously undesirable.
Not sure what you mean.

By default neither Apache nor Nginx logs any POST data. So with the 2 most popular options you actually have to go out of your way to enable this.

On the application side I mostly know Rails and it redacts even password hashes.
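
For anyone who hasn't seen that kind of redaction, here is a rough sketch of the idea. This is not Rails' implementation; the key names and the `redact_params` helper are assumptions made up for illustration.

```python
# Hypothetical list of parameter names that must never reach the logs.
SENSITIVE_KEYS = {"password", "password_confirmation", "password_digest"}


def redact_params(params):
    """Return a copy of the request parameters that is safe to log."""
    redacted = {}
    for key, value in params.items():
        if key.lower() in SENSITIVE_KEYS:
            # Replace the value regardless of its type.
            redacted[key] = "[FILTERED]"
        elif isinstance(value, dict):
            # Nested parameters (e.g. user[password]) are filtered recursively.
            redacted[key] = redact_params(value)
        else:
            redacted[key] = value
    return redacted


login_request = {
    "email": "user@example.com",
    "password": "ncc1701d",
    "user": {"name": "Jean-Luc", "password_digest": "abc123"},
}
safe_to_log = redact_params(login_request)
```

The point is that the filter sits in front of the logger, so blanket request logging can stay on without the password field ever being written out.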

> Client side hashing could solve this, but almost no one does it.

That is because it doesn't solve the problem. If you implement client-side hashing and then let your engineers see what arrives from the client, they will be just as able to log in as the snooped user; the client-side hash has added nothing.

What a client-side password hash does is to conceal the user's password from the user. There are a few types of password-related attacks:

1. If I learn your password, I can log in as you.

2. If I learn your password, I can log in as you, on other sites where you use the same password (and I can find your account, perhaps because you also use the same username or email address).

3. If I learn your hashed password, I can try to crack it.

4. If I dump the site's hashed password database, I can learn who else has the same password as you. (Because the hashes are the same.) Cracking one of them will crack everyone else's -- efficiency!

I might learn your password in a few different ways. Maybe it's "ncc1701d". Maybe I snooped your network traffic. Maybe I found the hash and cracked it.

So what are the solutions?

#1. There is no solution; this is working as intended. That said, HTTPS operates in this area, by making it more difficult for people to learn your passwords by snooping your network traffic.

#2. You, the user, would have to use different passwords for different accounts.

#3. I, the website operator, should process your password with a hashing function that is tuned to be slow.

#4. I, the website operator, should salt the passwords so that two identical passwords on two different accounts on my site produce different salted hashes.
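
A minimal sketch of solutions #3 and #4 together, using PBKDF2 from Python's standard library. The iteration count and salt length are assumptions you would tune for your own hardware, and a dedicated scheme such as bcrypt, scrypt, or Argon2 would serve the same purpose.

```python
import hashlib
import hmac
import os

# Assumption: tuned so that one hash takes a noticeable fraction of a second.
ITERATIONS = 600_000


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated per account."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Two accounts with the same password end up with different stored hashes,
# so dumping the database does not reveal who shares a password (attack #4).
salt_a, digest_a = hash_password("ncc1701d")
salt_b, digest_b = hash_password("ncc1701d")
```

The site stores the salt and the digest, never the password itself.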

Hashing the password on the client side and sending the hash over the wire does not address attacks 1, 3, or 4. They all work just as well regardless. It might make a difference if I try to perform attack #2.

This attack involves me (1) learning your password on one site, and then (2) using it to log in to your account on a second site. We will compare strategy (A) -- you enter your password, the site hashes it on your side, an intermediate hash is transmitted to the server, the server hashes it again, and finally the server stores the final hash in a database -- with strategy (B): you enter your password, the site transmits it in plain text to the server, the server hashes it, and finally the server stores the hash in a database.

In strategy (A), you have three passwords, plaintext, intermediate, and final; for purposes of parallelism, the same is true in strategy (B), except that your intermediate password is identical with your plaintext password.

In the case where I learn your final hashed password by dumping the site's database, there is no difference. In both cases, I will attempt to crack that password by guessing a likely password and running it through the hashing process.

In the case where I learn your plaintext password, there is also no difference. That's just attack #1.

In the case where I learn your intermediate password, perhaps by snooping network traffic or server logs, I can immediately perform attack #1 against your account on the particular site, because -- for purposes of that site only -- I have learned your plaintext password. I can also perform attack #3, cracking your hashed password, against your accounts on any and all other websites. But, in strategy (A), I cannot immediately perform attack #2 -- my attempt to crack your password would have to succeed first. If you have a weak password, it will. If you have a strong password, it probably won't.
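
Here is a small sketch of that last case. The `Site` class, the unsalted SHA-256 client hash, and the iteration count are all assumptions chosen to keep the demo short; a real client-side scheme would look different, but the replay works the same way.

```python
import hashlib
import hmac
import os


def client_hash(password):
    # Strategy (A), client side: this is what travels over the wire.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def server_hash(received, salt):
    # Server side in both strategies: hash whatever arrived, with a per-user salt.
    return hashlib.pbkdf2_hmac("sha256", received.encode("utf-8"), salt, 100_000)


class Site:
    def __init__(self, use_client_hash):
        self.use_client_hash = use_client_hash
        self.db = {}

    def wire_value(self, password):
        """What an eavesdropper or a request log would see for this password."""
        return client_hash(password) if self.use_client_hash else password

    def register(self, user, password):
        salt = os.urandom(16)
        self.db[user] = (salt, server_hash(self.wire_value(password), salt))

    def login_with_wire_value(self, user, received):
        salt, stored = self.db[user]
        return hmac.compare_digest(server_hash(received, salt), stored)


site_a = Site(use_client_hash=True)      # strategy (A)
other_site = Site(use_client_hash=False) # a different site, strategy (B)
site_a.register("victim", "ncc1701d")
other_site.register("victim", "ncc1701d")

snooped = site_a.wire_value("ncc1701d")  # the intermediate password

replay_on_same_site = site_a.login_with_wire_value("victim", snooped)
replay_on_other_site = other_site.login_with_wire_value("victim", snooped)
```

`replay_on_same_site` is True: on site A the intermediate value is the password, so attack #1 works immediately. `replay_on_other_site` is False until the intermediate hash is cracked, which is the only thing strategy (A) buys.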

This is very weak tea, which is why, when I worked as a security consultant, we considered the defense you describe, "passing the hash", to be a red flag, and always recommended against it.

In the case being discussed here, where Facebook is explicitly worried about engineers viewing users' Facebook accounts, as opposed to their eBay accounts, client-side hashing has literally nothing to add.

Only the hashed versions should ever be stored. Like the OP said, there is zero reason to store plaintext user passwords.
IMO, it's likely that the article was confusing "passwords" with "hashes" here. No company the size of Facebook is going to be storing plain-text passwords in 2021.
"No company the size of Facebook is going to be storing plain-text passwords in 2021 on purpose."

We've seen stories where servers were logging the plain-text data they received and those logs persisted for some time, but the plain text was never "stored" in the database.

Related anecdote: when I was in university, I changed my university IT services password to something "offensive" (it had the F word in it) after getting frustrated trying to find one that met the novelty and entropy requirements. IT later contacted me to tell me that it was an inappropriate password and that I should change it. I found it much more offensive that IT could see my password in plain text than I would have found reading a swear word.
The password probably was not stored in plaintext (if you've been to university in the last thirty years), but IT staff might have periodically run a password-cracking tool in order to find weak passwords (and swear words in various languages will certainly be in their dictionary). They alert the user and request that the password be changed (and might disable the weak one) in order to safeguard their network.
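
A rough sketch of how such an audit can work against salted hashes without anyone ever seeing a stored plaintext. The word list, the suffix rules, and the low iteration count are assumptions for a quick demo; real cracking tools apply far larger dictionaries and rule sets.

```python
import hashlib
import hmac
import itertools
import os

ITERATIONS = 1_000  # deliberately low so the demo runs quickly


def hash_password(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def candidates(words, suffixes):
    """Yield each word, lowercased and capitalized, with each suffix appended."""
    for word, suffix in itertools.product(words, suffixes):
        yield word.lower() + suffix
        yield word.capitalize() + suffix


def audit(accounts, words, suffixes):
    """Return {user: guess} for accounts whose stored hash matches a candidate."""
    weak = {}
    for user, (salt, digest) in accounts.items():
        for guess in candidates(words, suffixes):
            if hmac.compare_digest(hash_password(guess, salt), digest):
                weak[user] = guess
                break
    return weak


def make_account(password):
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


accounts = {
    "alice": make_account("Darn+1"),            # dictionary word plus a common suffix
    "bob": make_account(os.urandom(12).hex()),  # random, not derivable from the list
}
weak = audit(accounts, words=["darn", "heck"], suffixes=["", "1", "+1", "123"])
```

Here only alice is flagged. The auditor learns her password because it guessed it, not because it was stored anywhere readable.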
This is an interesting point, and I did consider it as I was typing the comment. If I remember correctly, the password was fuckStateU+1 with my university name (abbreviation) subbed in (like I said, I was getting angry trying to meet the special character etc. requirements). Do you think the password-cracking software they use would break such a password any faster than brute force? I suppose it's possible but I'd discounted it.

This was about 15-16 years ago I think.

There are all kinds of places storing things that aren't explicitly stored. For example, a VM snapshot of a system that was processing a request including a password and had it in the memory at the time, any service (including something that does just network proxying) with a debug mode that logs full request data, etc.
Could it be possible they mean access to servers where authentication is handled? If you had root access to such a server you could look at memory or packets and work towards revealing a user's password.
Is the password sent to Facebook or does it never leave the client? (Genuine question - I have no idea how modern web apps do authentication)
In the case of a username/email + password login, the password is sent to Facebook. Code on their end runs it through a one-way hash function, and the result is compared against the value they stored in the database when you set your password. As long as the hash is computed with the same inputs (the same password and salt), the two values match, and FB knows that you've typed the correct password without having to store your plain-text password in their database.

Some insecure websites (not Facebook) may not do this, and instead store your credentials in a database unhashed. It's a terrible idea, and GP's comment seemed to be referring to this when they (correctly) suggested that no one should have access to a database of plain-text passwords.

The replies mostly refer to the fact that even if the password never hits FB's database, there is still code running on authentication servers that handles that password in plain text before it's been hashed. Limiting engineer access to authentication servers is a good idea, but it'd be challenging to prevent ALL engineers from having access.

It is sent over the wire.
Or one step further, the passwords are hashed and salted using a password-hashing scheme like bcrypt.

Any employee logins are done through skeleton keys that are audited.