Anyway, as best as I could tell, the consensus on the public mailing list was that the best case for a singular winner would be an amalgamation of four finalists: Argon2, Lyra2, Catena, and yescrypt. Each of them has some properties that are desirable. I'm curious to what extent Argon2 will be modified - and especially curious if the final spec will have tunable parameters / multiple modes or be a one-size-fits-all deal.
Edit: If you're interested in more information, a decent starting place is this paper: https://eprint.iacr.org/2014/881.pdf If you folks want more reading material, I can pull some emails from the mailing list
To be clear, that isn't the final spec. AFAIK, we don't yet know the extent of changes that may be made to Argon2. The email says that Argon2 will be the "basis" for the final winner, which could have some interesting results.
I expect there will be a long review period before the winner is considered suitable for common use. It reminds me of the backlash at NIST for some late changes to the SHA-3 winner, Keccak: https://en.wikipedia.org/wiki/SHA-3#NIST_announcement_contro...
I don't expect there to be a similar community reaction. NIST and the PHC panelists (list at https://password-hashing.net/) are two very different bodies.
Argon2's internals will likely be altered (with community input); Keccak's internals were left alone (only parameters were tweaked).
The community wants modifications to Argon2 before its final publication. Essentially, each of the five supreme finalists possess nifty properties. It'd be nice if Argon2 could be modified to possibly have some of these too.
That being said, I do expect you're correct that the winner will not be in mainstream production for a couple years at least.
Impressive stuff. One of the features of the winner is that you can offload the expensive computation to a client and still maintain the security you would have if it were done on the server. This should hopefully persuade people to use slow hash functions where they otherwise would not due to performance concerns.
It also allows client-independent strengthening of the hashes, so you don't have to wait for users to login again to be able to increase the strength of all existing hashes.
GP might be thinking of cases where the Nth iteration of the hash is only based on the salt and the result of the N-1st iteration, rather than on the original passphrase.
I'm not aware of any currently recommended algorithm that does this, though. The original passphrase usually goes into each and every iteration, not just the first round.
Exactly, it's hacky, but Scrypt'ing your ancient MD5 databases is better than sitting on your ass and being caught with your pants down when your database gets dumped on pastebin or a Russian forum
Being able to have the client (which knows the password it offered) do this computation isn't something super special though-- it's something that could be done fine with pbkdf2-- for example.
If fancy client support is really an option it would usually be better to use a zero-knoweldge authentication protocol (like SRP), though one of these KDFs could be used as a preprocessing step.
> One of the features of the winner is that you can offload the expensive computation to a client and still maintain the security you would have if it were done on the server
Yeah, I've heard it called "server relief." Slow password hash is computed on the client, then transmitted to the server and run through a fast hash before being stored.
What tools are typically used to develop these algorithms? The site has https://password-hashing.net/faq.html#qd which mentions attempts to formally define the security of a good algorithm though a quick scan of the two papers indicates the definitions are mathematical properties described in English. However, when it comes to implementations is there a generally accepted language/framework in which correctness can be proven? Haskell comes to mind as one such language which its proponents tout as ensuring correctness, though I lack the experience to determine whether this means "your broken algorithm runs 100% correctly" vs. "a broken algorithm will not compile".
Typically, cryptographic primitives (like block ciphers, hash functions, KDFs, etc) are written in C. Bugs are caught solely by the use of a sharp mind and careful reading.
There are really two reasons for this: (1) optimization is a really big deal; some impls go so far as to have asm for specific architectures or use architecture-specific features like SSE2 and (2) there really isn't very much code for a cryptographic primitive, typically <=1000 lines. (e.g., see https://github.com/khovratovich/Argon2/tree/master/Argon2i/r...).
Once a simple reference implementation is hand-checked, it can be used to generate comprehensive test vectors for other implementations.
This isn't my area of expertise, but I believe there's some research in applied crypto for implementation verification. I know https://github.com/GaloisInc/cryptol is a language that's supposed to help in that regard, but I don't know offhand who all uses that.
The other reason for C is that the reference implementation is generally expected to eventually have bindings for every other language - a new standard won't flourish if you find out it's hard access to it in your choice of language.
If you can also write an implementation in something such as Haskell, verifying the input/output between the two implementations is a great way to assure it works as expected.
Low-level crypto stuff such as block cyphers and hashing functions is usually done in C or assembly language. One of the biggest reasons is that its very hard to avoid timing attacks on higher level languages (when a computation takes a different time to run depending on the inputs an attacker can extract some info about secrets you control). Performance is also really important because the bad guys will use whatever means necessary to speed up their brute-forcing and you don't want to give them too much of an edge.
Another reason for C and assembly is that when everything your algorithm manipulates is a bunch of bytes you don't get much advantage from a type system.
Anyway, as best as I could tell, the consensus on the public mailing list was that the best case for a singular winner would be an amalgamation of four finalists: Argon2, Lyra2, Catena, and yescrypt. Each of them has some properties that are desirable. I'm curious to what extent Argon2 will be modified - and especially curious if the final spec will have tunable parameters / multiple modes or be a one-size-fits-all deal.
Edit: If you're interested in more information, a decent starting place is this paper: https://eprint.iacr.org/2014/881.pdf If you folks want more reading material, I can pull some emails from the mailing list