It's not only dealing with password length. You could DoS the server by tying up all threads in processing the upload. Uploading 4GB isn't instantaneous.
For the application code to complain about password length, the 4 GB upload has to have already occurred.
Protecting against a DoS attack this way is done in the web server, which doesn't care about individual fields. Sure, this means there's an implicit password length limit, but not in any application-level sense.
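To make the "done in the web server" point concrete, here is a minimal sketch of a size check that runs before any body is read. It assumes the front end exposes request headers as a plain dict; `MAX_BODY_BYTES` and `body_allowed` are hypothetical names, and a real server would also need to cap chunked uploads, which carry no `Content-Length`.

```python
MAX_BODY_BYTES = 64 * 1024  # hypothetical cap; choose one that fits your forms


def body_allowed(headers, max_bytes=MAX_BODY_BYTES):
    """Return True only if the declared body size is within the cap."""
    raw = headers.get("Content-Length")
    if raw is None:
        return False  # no declared length: refuse rather than read blindly
    try:
        length = int(raw)
    except ValueError:
        return False  # malformed header
    return 0 <= length <= max_bytes


print(body_allowed({"Content-Length": "120"}))
print(body_allowed({"Content-Length": str(4 * 1024 ** 3)}))
```

The check never looks at individual fields, so the password length limit it implies is a side effect of the request limit, not an application rule.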
I'm not arguing with that. I wholly agree with you that a client should not be able to upload 4 GB for the password field. I think I misread your first point as if it dealt with server-side storage.
No; this is effectively the same as doing no hashing at all. If your database gets stolen, people can replay the "hashed" passwords from it to the server, without having to hash them themselves.
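A small sketch of that replay, assuming a naive server that stores the client's hash as-is and compares it directly. The `stored` dict and the function names are made up for illustration.

```python
import hashlib
import hmac


def client_hash(password: str) -> str:
    """What the browser would send instead of the password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Naive server: the database holds exactly what the client sends.
stored = {"alice": client_hash("correct horse")}


def naive_login(user: str, submitted: str) -> bool:
    return hmac.compare_digest(stored.get(user, ""), submitted)


# An attacker holding a database dump never needs the password itself.
stolen = stored["alice"]
print(naive_login("alice", stolen))
```

The stolen value logs in directly, which is why the stored value has to be a further hash of whatever the client sends.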
I didn't mean to imply that you'd just store the hash the client comes up with. That's idiotic, of course. Not everyone uses SSL, even though they should, and SSL is not always secure. Even with SSL, it seems there would be a potential length attack that could be used to guess a user's password length. So in all cases, IMO, it makes more sense to receive a fixed-length value that is fairly insensitive to attack in itself. So perhaps a user has 2 salts associated with their account, per password: an auth salt and a storage salt. Then an auth looks something like this:
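The snippet that went with this comment is not preserved here. One possible reading of the two-salt idea is sketched below; the function names, the choice of PBKDF2-HMAC-SHA256, and the iteration count are all assumptions, not the commenter's code.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, not taken from the thread


def client_hash(password: str, auth_salt: bytes) -> bytes:
    """Client side: turn any password into a fixed-length value to send."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), auth_salt, ITERATIONS)


def storage_hash(received: bytes, storage_salt: bytes) -> bytes:
    """Server side: hash the received value again before storing it."""
    return hashlib.pbkdf2_hmac("sha256", received, storage_salt, ITERATIONS)


def register(password: str) -> dict:
    """Create the per-user record: two salts plus the stored hash."""
    auth_salt, storage_salt = os.urandom(16), os.urandom(16)
    stored = storage_hash(client_hash(password, auth_salt), storage_salt)
    return {"auth_salt": auth_salt, "storage_salt": storage_salt, "stored": stored}


def verify(record: dict, received: bytes) -> bool:
    """Re-hash what the client sent and compare in constant time."""
    candidate = storage_hash(received, record["storage_salt"])
    return hmac.compare_digest(candidate, record["stored"])


record = register("hunter2")
print(verify(record, client_hash("hunter2", record["auth_salt"])))
print(verify(record, record["stored"]))  # replaying the stored value
```

In this sketch the server only ever processes a 32-byte value, and replaying the stored hash fails because it gets hashed again. The value sent over the wire is still what authenticates the user.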
Now you have quite a restricted domain, so if your database is compromised there are a lot fewer values an attacker needs to enumerate to try and crack the password.
That reasoning doesn't really sit well with me as a mathematician.
I would expect MD5, restricted to 128-bit inputs, to be almost a bijection onto its codomain (would love to be demonstrated wrong here, if anyone knows of a paper that studies this, but it seems reasonable that any good hash would have this property).
This doesn't really protect against a dictionary attack, but to guarantee a collision, an attacker would still need to try approximately 2^128 passwords for each password in the database, which is already the worst case for the attacker, so no strength is really lost.
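On the bijection question, a toy experiment is easy to run, though it settles nothing about full MD5. The sketch below truncates MD5 to 16 bits (the `BITS` value and helper names are assumptions for illustration) and counts how many of the 2^16 possible outputs are hit by 16-bit inputs. For an ideal random function the expected fraction is about 1 - 1/e ≈ 0.632, so not close to a bijection.

```python
import hashlib

BITS = 16  # toy width; real MD5 is 128 bits, far too large to enumerate


def toy_hash(x: int) -> int:
    """MD5 truncated to BITS bits, applied to a BITS-bit input."""
    digest = hashlib.md5(x.to_bytes(BITS // 8, "big")).digest()
    return int.from_bytes(digest[: BITS // 8], "big")


def image_fraction() -> float:
    """Fraction of the 2**BITS possible outputs reached by some input."""
    size = 2 ** BITS
    image = {toy_hash(x) for x in range(size)}
    return len(image) / size


print(image_fraction())
```

If the same ratio held at 128 bits, the restricted domain would cost less than one bit of the search space, which is consistent with the claim that no real strength is lost.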
Here is the attack scenario: Alice is trying to log in to server Bob, but her WiFi access point Eve is running a protocol-mismatch attack: Alice communicates with Eve over HTTP, which Eve records and then relays to Bob over HTTPS. We'll assume that Eve does not inject her own scripts but just eavesdrops on the conversation.
Your problem: clientHash is "password-like" and Eve can use it to log in.
One solution: use the HTTPS auth methods. Unfortunately, this is not the standard in the industry because the GUI is ugly.
The reason why this is an even bigger problem than you might think: what does the login method do? It probably sets a "remember me!" cookie, which is also "password-like" for just about every purpose. Eve sniffs the cookie and can use it to compromise the account in the future.
Also the 'remember me' cookie is probably stored in the clear in the database. So much for that, eh?
What you really want is a system based on digital signatures: whenever I make a request I digitally sign it. But the core problem is that I don't want to type in my password for every damn request, so this is being cached between requests and between page views. Depending on how it's cached, there are vulnerabilities -- especially if you just have your browser automatically fill in the password blanks, but also if you carry these "password-like" login cookies. A system of delegation and revocation can be done, but you come dangerously close to reinventing a public key infrastructure if you go too far down this road.
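As a rough illustration of per-request signing, here is a sketch that uses an HMAC over a shared session key as a stand-in for a true digital signature (the standard library has no asymmetric signing). The function names, message layout, and `max_skew` window are assumptions. The cached key is exactly the "password-like" secret described above, so this shows the mechanics, not a fix.

```python
import hashlib
import hmac


def sign_request(key: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Tag a request with an HMAC over its method, path, time, and body hash."""
    message = b"\n".join([
        method.encode("utf-8"),
        path.encode("utf-8"),
        str(timestamp).encode("utf-8"),
        hashlib.sha256(body).hexdigest().encode("utf-8"),
    ])
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_request(key, method, path, body, timestamp, tag, now, max_skew=300):
    """Reject stale requests, then compare tags in constant time."""
    if abs(now - timestamp) > max_skew:
        return False
    expected = sign_request(key, method, path, body, timestamp)
    return hmac.compare_digest(expected, tag)


key = b"session-key-for-illustration"
tag = sign_request(key, "POST", "/transfer", b"amount=10", 1_000_000)
print(verify_request(key, "POST", "/transfer", b"amount=10", 1_000_000, tag, now=1_000_010))
print(verify_request(key, "POST", "/transfer", b"amount=9999", 1_000_000, tag, now=1_000_010))
```

An eavesdropper who records a tag can replay that exact request inside the time window, but cannot alter it or sign new ones without the key.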
HTTPS auth methods can help out quite a bit, but can be hard to set up with a custom backend and hard to test against that sort of "downgrade attack" I mentioned in my first paragraph. The best of these ideas is to give your users client certificates and thus be safe forever, but nobody has found it useful enough to actually do this.
The only real alternative has been to remind the customers to always look for the s in the "https://" in the location bar. This is stupid, and now there are some proposals for browsers to ship with lists of sites which are "Expected to be secure" to avoid it, but yeah, we never really outgrew it.
So sitting here trying to think about a way to solve that problem, what I come up with is essentially PKI. But if the attacker has the ability to inject code, they can always break this entirely by stealing my private key.
This feels to me like URLs are fundamentally broken, in that a user might try to go to http://mybank.com. Is there any secure way to get their browser to redirect to https://mybank.com? It seems like there might be something that could be done with dnssec, but that feels brittle too. Gross.
I'm not sure about the added security benefit. Basically, the only additional security is that if someone gains access to your server, they can't capture plain text passwords anymore. But once someone gains access, all bets are off, and they might as well just switch out the JavaScript. In any case, you should run a strong hashing algorithm like bcrypt with a salt and, oh, I don't know, 1000 iterations? Also, last time I tried something like that (which was a few years ago) I ran into big problems with different hashing algorithm implementations providing different results (JS.md5("password") != Python.md5("password")).
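The comment doesn't say what caused the mismatch. One common cause, offered here only as a guess, is that the two sides hash different byte encodings of the same string. The helper name `md5_hex` is made up for this sketch.

```python
import hashlib


def md5_hex(text: str, encoding: str) -> str:
    """Hash the bytes of `text` under an explicit encoding."""
    return hashlib.md5(text.encode(encoding)).hexdigest()


ascii_utf8 = md5_hex("password", "utf-8")
accented_utf8 = md5_hex("pässword", "utf-8")
accented_utf16 = md5_hex("pässword", "utf-16-le")

print(ascii_utf8)
print(accented_utf8 == accented_utf16)  # same string, different bytes
```

Hash functions operate on bytes, so agreeing on an encoding (UTF-8 on both ends, say) is a precondition for the client and server digests to match.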
The added security benefit of server-side hashing is the same as when plain text passwords are sent: it prevents knowledge of the authentication secret if the database contents are disclosed to malicious third parties. The client-side hash of the password is only there to ensure that a fixed-length secret is sent and subsequently processed, to avoid DoS attacks on the server.