by sroussey 1986 days ago
I mentioned this on Twitter[1]: the reason to encrypt before sending over SSL is not about double encryption; it’s about how a large backend system is designed.

Often, you have load balancers that are SSL endpoints, so the data is decrypted at that point.

You can start to see the problem already. What if there is a different bug, and so a dev starts logging requests somewhere down the line? You accidentally start logging cleartext passwords. Oops. Facebook was fined for this not that long ago.

But if the password is encrypted, then it’s not really an issue, and the black box blob can be forwarded to a login microservice. There, the team decrypting will be on higher alert.

So depending on the structure of various teams, you now have fewer teams that need that kind of security oversight and can move faster.

Smaller blast radius if something goes wrong.

[1] https://twitter.com/sroussey/status/1347688753221931010?s=21

4 comments

I agree this may be why they did it.

However a better approach to this problem is to not rely on shared secrets. Use public key signature tech to stop worrying about mistakenly logging a secret. If you never had it, you can't lose it.

If you were to log literally every byte of the plaintext traffic when I sign into GitHub (e.g. maybe you're a GitHub ops person), you don't get the ability to sign into GitHub as me. There's a WebAuthn signature step, my signature is authentic, and you can even verify that from your log if you want, but you'd need to make a new signature to sign in, and you can't do that because the key needed to make my signature never left my hands.

Even better, GitHub defuses their liability because as well as a (presumably hashed) password that could be broken by a hypothetical attacker they've got a public key for me, and learning that public key doesn't help the attacker do anything, at their site or anywhere else. Even - unlike with the SSH public keys GitHub holds - to identify people, since WebAuthn public keys are deliberately uncorrelated you can't match my GitHub key against a Facebook key for example.

Yeah, I do hashing client side as well as server side. It's not ideal - what would be ideal is zero knowledge proofs for such a thing. But it's basically:

hash(static-pepper, username, password) * 250k

That + tagging the password with "password++" or something means that you're a lot safer against the major issue of leaking a password before it's stored, for example the mistake that definitely happens everywhere of "let me just add request logging, whoops there's everyone's plaintext passwords". You can always search your logs for the 'password++' tag and alert if you find it, and if that does happen at least you know an attacker isn't going to have an easy time extracting a plaintext password - it buys you time.

And if an attacker gets SQLi or whatever and dumps the passwords they're that much harder to crack - you've added hundreds of thousands of iterations of key stretching, and it's totally distributed to clients so you don't even have to worry about it blowing up your db/ auth service CPU.

And it's trivial to implement, which is the really important part. ZKP is a lot more work, but what I described is like 5 extra minutes and pretty trivial.
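A rough sketch of the scheme described above, written in Python only to show the shape (in a browser this would be JavaScript). The pepper value, the use of PBKDF2-HMAC-SHA256 as the "hash ... * 250k" step, and the helper names are all assumptions for illustration, not what the commenter actually runs:

```python
import hashlib

# Hypothetical values: the pepper and tag are placeholders for illustration.
STATIC_PEPPER = b"example-site-pepper"
TAG = "password++"
ITERATIONS = 250_000


def client_stretch(username: str, password: str) -> str:
    """Stretch the password on the client and tag the result.

    PBKDF2 stands in for "hash(static-pepper, username, password) * 250k";
    the pepper plus username acts as the salt.
    """
    salt = STATIC_PEPPER + username.encode("utf-8")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return TAG + digest.hex()


def find_leaks(log_lines):
    """Return every log line that contains the tag, so it can trigger an alert."""
    return [line for line in log_lines if TAG in line]
```

The server would still hash whatever it receives before storing it; the tag only makes accidental logging easy to detect.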

> hash(static-pepper, username, password) * 250k

Though this is obviously better from a wire-interception PoV, it means that you can't enforce any password policies, or maintain a list of leaked/bad passwords (e.g. HIBP).

You can enforce password policy client side. A technical user can go way out of their way to bypass it, but honestly, at that point who cares? If you really want to you can send up some metadata or something I guess.

Another option is to send the password at user creation time, but then to rely on a hash at login time. Now there's leak potential, but you just have to audit for leaks in one part of the codebase.

There's a lotta stuff you can do to improve upon the very quick version I'd mentioned.

Why not hash the password instead?
Hashing the password client-side makes a leak of the password hashes (which happens all the time) equivalent to leaking the password itself.

https://en.wikipedia.org/wiki/Pass_the_hash

It's definitely not equivalent. The plaintext isn't (as easily) recoverable, which means that if the user used the same plaintext password for another service it's (somewhat more) protected.

Pass The Hash is also protocol specific - if you try to replay a hash to your average HTTP service it won't go "oh, it's already hashed, thanks" it'll just hash it again and you'll fail to authenticate.
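To make the "it'll just hash it again" point concrete, here is a minimal sketch, assuming the server treats whatever the client sends as the password and runs its own salted hash over it. The function names and the iteration count are made up for the example:

```python
import hashlib
import hmac
import os

SERVER_ITERATIONS = 10_000  # illustrative value only


def server_store(client_hash: bytes):
    """Hash the client-supplied value again with a random salt before storing."""
    salt = os.urandom(16)
    stored = hashlib.pbkdf2_hmac("sha256", client_hash, salt, SERVER_ITERATIONS)
    return salt, stored


def server_verify(submitted: bytes, salt: bytes, stored: bytes) -> bool:
    """Re-hash the submitted value and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", submitted, salt, SERVER_ITERATIONS)
    return hmac.compare_digest(candidate, stored)
```

Replaying the value dumped from the database fails because it gets hashed once more. The value the client sends over the wire is a different story: if that leaks, it can be replayed against this site, which is the concern raised above.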

Then just add a time sensitive seed to it? I don't think it is equivalent to leaking plaintext. It can't be used to guess passwords on other websites.

If your SSL layer is compromised, you can't trust the client-side encryption. The attacker can send arbitrary javascript.

> Then just add a time sensitive seed to it? I don't think it is equivalent to leaking plaintext. It can't be used to guess passwords on other websites.

You're reinventing password hashing and salting. Further, there's no guarantee that the hash cannot be used to guess passwords on other sites. For what benefit, exactly? Your hash is now the password, and basically as dangerous as it was in a more conventional arrangement.

Pass-the-hash is a real kind of vulnerability that has been used to exploit real systems. We might be better off sticking with design approaches that don't have this problem instead of trying to fix our way out of the problem.

> If your SSL layer is compromised, you can't trust the client-side encryption. The attacker can send arbitrary javascript.

Are you sure this is what it's guarding against? A sophisticated application architecture might involve a load balancer decrypting and doing the initial routing, several sets of data handoffs, and then the application that needs it handling the password. Any one of them could mishandle or leak the password, but only the one at the end actually needs it in the clear.

How exactly is asking for my password to be hashed "reinventing password hashing and salting"? Seems like the opposite, no?

If your password is properly salted, it can't be used to guess passwords on other sites, that's the whole point of salt and hash.

The fact that RSA is being used means that your plain-text password is going to appear on their servers. Maybe it won't get cracked in the SSL layer, but it is still there.

> Are you sure this is what it's guarding against? A sophisticated application architecture might involve a load balancer decrypting and doing the initial routing, several sets of data handoffs, and then the application that needs it handling the password. Any one of them could mishandle or leak the password, but only the one at the end actually needs it in the clear.

Do you realize that if an adversary even only has read access to the SSL layer, they can just copy the cookie and steal the account that way?

> How exactly is asking for my password to be hashed "reinventing password hashing and salting"? Seems like the opposite, no?

You've already started to add new things, like a TOTP-ish element, to stymie replays. Then the server has to check what it's been fed, having stored neither the original password nor the hash of the password it's been passed. It cannot be allowed to have the hash because the hash is now the password. It needs something safe-ish to store that the input can be computed on to make comparisons possible.

Now you have all the problems of server-side hashing and comparison coupled with extra client-side hoops.

Again, what have you gained?

> Do you realize that if an adversary even only has read access to the SSL layer, they can just copy the cookie and steal the account that way?

You are absolutely correct. That is completely accurate in every single possible way.

Do you think that perhaps there might be other reasons to consider here? Such as debugging, logging systems, and so on? Perhaps there are design goals beyond blocking direct attacks. On an average day, most of these systems will be more likely to be accessed and used by authorized administrators than by external adversaries, after all. Many security incidents arise not out of malice, but out of tools behaving dangerously. I know I've dealt with sensitive material leaking into logs.

I hope I have made myself clearer. I can see I failed to communicate effectively previously. Please, don't hesitate to say so if I have failed either there or in understanding your points.

You could. That's how something like CHAP works.

You'd actually end up hashing it twice. Once using the salt to go from plaintext to what the server has stored and then again using the challenge.

It has problems though. The strength of your password hashing would be limited by what the weakest client could do, rather than what the server could do. Asymmetric encryption ends up being simpler.
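The two hashing steps can be sketched roughly like this. It is a CHAP-style illustration, not the actual CHAP wire format; using PBKDF2 for the first step and HMAC-SHA256 for the challenge step, plus all the names, are assumptions for the example:

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative; bounded by what the weakest client can do


def stored_hash(salt: bytes, password: str) -> bytes:
    """First hash: plaintext plus salt gives the value the server keeps."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def new_challenge() -> bytes:
    """Server side: a fresh random challenge for each login attempt."""
    return os.urandom(16)


def client_response(salt: bytes, password: str, challenge: bytes) -> bytes:
    """Second hash: key the challenge with the value the server has stored."""
    key = stored_hash(salt, password)
    return hmac.new(key, challenge, hashlib.sha256).digest()


def server_check(stored: bytes, challenge: bytes, response: bytes) -> bool:
    """Server recomputes the response from its stored value and compares."""
    expected = hmac.new(stored, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

A captured response is useless for a later login because the challenge changes, but the stored value itself is enough to answer any challenge, so a database leak is still serious in this design.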

So what you’re saying is if you know what you’re doing and you don’t log any clear text passwords it’s fine to not double encrypt?
It's extremely naive to think that bugs and mistakes only happen to those who "don't know what they're doing".
At some point we must simply “know what we are doing” and do something. Software would be a tedious affair if you have to approach everything from the perspective that maybe you don’t really know what you’re doing and need to implement some way to verify.
What are unit tests, linters, SAST tools, and more all for? Naively, it seems to me that we swim in a great sea of tools to verify that we do, in fact, know what we're doing.
How do you know your tests are correctly testing what you need to test? Tests all the way down?
An excellent question! One reflecting true wisdom.

The answer is that you never have true certainty. What you get is smaller error bars with each set of tools and tests until you find the error bars and effort involved both acceptable.

The other comment mentions tests, but I'd like to add to that. The reason why it's impossible for anyone, even the single greatest programmer on the planet, to not create bugs, is due to the high complexity of the software we write. It's not so much that you'll make a mistake when creating a new program. It's that over time, as you make more and more changes, it's impossible to actually trace out the exponential complexity that arises from your changes.

The purpose of unit tests is to catch these, and yes, as you mention lower down, even those aren't 100% infallible, but they greatly help. That's why even with some of the greatest programmers and extensive testing, we still constantly see major bugs from every single top tech company. I don't think there exists a single completely bug-free piece of software of non-negligible size out there.

If you are immune from mistakes, sure.

Making decisions as if you are immune to mistakes is a good way to ensure you prove yourself wrong in the worst way.

Making decisions to prevent things which should be impossible is how you build reliability.

And, adding assertions / very loud logs is a good way to be sure your assumptions hold: e.g. causing your tests to fail if the test user's password (or other privileged info, etc) appears in any logs is a fairly good extra safety layer.
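For what it's worth, a minimal version of that kind of test might look like the following. The `handle_login` function is a hypothetical stand-in for the real code path, and the password is a made-up test value:

```python
import io
import logging

TEST_PASSWORD = "correct-horse-test-only"  # made-up value used only by the test


def handle_login(logger, username, password):
    """Hypothetical login handler; logs the attempt but not the secret."""
    logger.info("login attempt for user=%s", username)
    return True


def capture_logs(func, *args):
    """Run func with a logger whose output is captured in memory."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    logger = logging.getLogger("login-test")
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    try:
        func(logger, *args)
    finally:
        logger.removeHandler(handler)
    return buffer.getvalue()


def test_password_never_logged():
    """Fail loudly if the test user's password shows up in any log output."""
    output = capture_logs(handle_login, "test-user", TEST_PASSWORD)
    assert "test-user" in output
    assert TEST_PASSWORD not in output
```

In a real suite the capture would need to cover every sink the service writes to (files, request loggers, error reporters), not just one logger.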
That’s a great test idea I have not used myself, thanks!

I’m also working on forks of logging systems to make them smart about what they log in the first place, but not quite like Palantir does it.