Hacker News new | ask | show | jobs
by BearsAreCool 2148 days ago
One thing that bothers me about most of these algorithms is how conceptually difficult they are for me to understand. I don't have that strong of a background in cryptography but there is something special about how with RSA you can do it at a small scale by hand. With very limited notes I can explain to a group of high schoolers almost exactly how this vital part of encryption functions and even vulnerabilities in it (small exponent values, etc). While rolling your own crypto is bad, with a bit of work I'd expect almost any programmer to be able to implement RSA without too much difficulty.

I'm not sure how important being able to explain encryption algorithms in detail to high school students is, but I'll definitely be sad when RSA and other non-quantum resistant algorithms join the likes of the enigma machine and exist for demonstration process only. I'm hopeful in time less crypto minded people (like me) will better understand module learning with errors or lattice based cryptography and maybe we'll get better at explaining it. It would be a shame if even more of cryptography into an unintelligible black box to eke out performance gains.

7 comments

RSA isn't that easy to understand. It's a trap: it seems simple on the surface because the core mathematical operation is easy, but the actual complexity (padding) often gets omitted.

RSA-KEM (key encapsulation mode) is easy. There's no padding, the only thing is you can't encrypt a message, only a random number < the modulus. Then you run that number through a Key Derivation Function (KDF) and use it with an Authenticated Encryption with Associate Data (AEAD) cipher. That's just simple enough I'd teach it to high-school students.

RSA encryption is hard. It requires OAEP padding, which is by no means simple. I'd not try to teach it to high-school students.

RSA signing is hard. It requires PSS padding, which is by no means simple. I'd also not try to teach it to high-school students.

Every other use of RSA is insecure, often in very subtle and hard to understand ways. Using RSA as a teaching tool about public-key cryptography does a disservice to students, since they (like you) tend to think that RSA alone is useful for cryptography.

And if you can treat an operation like the padding as a "black box" and ignore understanding, then you can understand code-based cryptography the same way: treat the code as a black box, and the math is very simple.

I sympathize but don't entirely agree.

The old version 1.5 signature padding remains widespread. Unlike PSS it doesn't have a security proof reducing to RSA but it does have decades of successful use in practice in one of the harshest environments (the Web because the clients merrily run code written by a potential adversary).

"Prepend this fixed data to your hash" which is the central idea of v1.5 padding is definitely something you could teach to high school students. You can even show them why it's necessary pretty easily for a small exponent.

Making people check padding is again not too hard for high school students, and I think "Do all the things on the checklist. All of them" is a worthwhile lesson not just in cryptography.

It is possible to implement 1.5 padding securely. But I'd not say it's easy, since I'm not aware of any implementation not written by an experienced cryptographer/cryptographic programmer that's gotten it right on the first try (no padding oracles or other vulnerabilities). It's conceptually simple, but still has plenty of footguns. The problem with RSA is all the footguns! Avoiding them reduces the conceptual simplicity, to the point where you might as well teach a slightly more complex scheme, like EdDSA or McEliece (with black-box code).
The most common mistake in implementations of RSA signature verification seems to be just plain not validating the padding at all. That does not require a "experienced cryptographer" it requires actually doing everything on the checklist, a reflex that's worthwhile in many pursuits.

Historically sometimes people would say "Well nobody would make an error like that in a modern elliptic curve scheme" and then Microsoft turns out to have shipped a very popular operating system named "Windows" which didn't do curve validation - so much for that belief.

Padding oracles aren't a thing for signature verification. You can work this out for yourself, everybody can do signature verification using the public key, so if there was a way to do it "badly" that somehow gives away the private key, you'd do it that way yourself and skip all the effort.

What does exist and might have confused you is an oracle for RSA PKCS#1 v1.5 decryption. In this case the party doing decryption knows the private key and so it makes sense that if this is done poorly they can leak vital information. It doesn't seem fair to call this a flaw in the signature scheme.

> Historically sometimes people would say "Well nobody would make an error like that in a modern elliptic curve scheme" and then Microsoft turns out to have shipped a very popular operating system named "Windows" which didn't do curve validation - so much for that belief.

I am pretty sure that CryptoAPI does not support (or at least did not at the time) any modern elliptic curve signature scheme, and by modern I am referring to thins like ed25519.

Not to mention the most difficult issue that even mainstream libraries (like libgcrypt which is used by gpg) get wrong: implementing rsa in constant time as to avoid timing side channel attacks. I would argue that understanding and implementing elliptic curves (the modern ones especically) correctly is much easier.
Do you have a reference to any practical side channel attack on, say, 2048 bit RSA using the current version of libgcrypt? If so it should be filed as a bug.
The current version of libgcrypt is not practically exploitable using the technique described in that 3 year old paper.
I remember that their changes <https://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git;a=c... were still variable time and thus able to leak secrets. I presume that nobody bothered exploiting yet.

Anyway, even if this was fixed (which I doubt it) the point is that they had vulnerable code for 18 years.

You can intuitively reduce RSA to model problems that you can do by hand, or trivially implement it in Python, but it's debatable how much your intuition is actually serving you; do you really have your head around why RSA works, what its limitations are, and how it can be applied? I think you have to get that through careful study, not from the kind of "machine sympathy" you often acquire through implementing algorithms and systems code.

In reality, most programmers cannot implement RSA safely without much difficulty; the opposite thing is rather the case. There's a reason the term "schoolbook RSA" has the connotations it does.

But there still is value in "schoolbook RSA."

The OP thinks we lose something if we don't have "schoolbook $NextGenCrypto."

I would disagree, because we can still use "schoolbook RSA" for its primary purpose: education on the general concept. At the high-school, or even 101 uni level, either we are teaching basic, generalizable skills, or we are providing high-level surveys of general knowledge.

The use case isn't "How do I implement and use RSA as a practitioner in the field," but rather it is "How does this thing, that seems like magic, work?"

The risk is that the simplified model gives some kind of intuition that will provide a disservice later in life, but the benefits are just the opposite. Which is why showing how even small errors even in the toy-implementation can render it insecure is also a useful tool.

I dispute that; I think schoolbook RSA has probably done more harm than good. It's not showing you how this thing that seems like magic work; it's just replacing one magic trick with another.

It's not a hypothetical concern; you see it in every instance where someone has implemented schoolbook RSA in production, which was a not-infrequent occurrence when people were still using RSA in new designs.

Unlike Thomas I don't believe Textbook RSA is necessarily unhelpful for thinking about RSA. However I do think it's unhelpful for thinking about other public key algorithms. And even ignoring Post-Quantum Cryptography you probably want other algorithms than RSA in new systems.

What you will still see in brand new by-laymen for-laymen material today is an assumption that everything else is just like RSA but with trickier maths and that's nowhere close to true. For example, RSA can be used for signatures, and so can Ed448 using fancy elliptic curve maths, I know how to use RSA to encrypt this GUID†... so presumably now I can just encrypt the GUID with Ed448 instead of RSA? And the answer is just "No" because Ed448 is a signature scheme, it isn't for encrypting things.

This happens across all of IT, whether it is programmers bemused that git doesn't "lock" the remote repository when they check code out, or surprised that Strings aren't actually an array of Characters in a new language they're learning, or network engineers who still haven't seen CIDR notation... but in cryptography your mistakes from false over-generalization can blow up in other people's faces really badly.

† There's an excellent chance you were doing this wrong, but that's not the point here.

Because post quantum crypto algorithm are based on completely different breed of hard problems. Lattice problems and they are not straightforward to visualize as we were with RSA. Visualizing a lattice of large dimension, calculate its shortest vector or various other properties is not straightforward.

My bachelors was in Math and my master thesis was about lattice problems and post quantum cryptography. For my Ph.D. I switched to completely different subject because I couldn't write good papers in this area. So I understand your concern.

2d lattices are easy to visualize though. I've given brown bags to general engineers on NTRU like systems, and have found that some find it even easier to understand than RSA as it's more graphical
I am actually not sure if this isn't more of a problem of RSA than a feature.

I have seen many simple introductions to some cryptography that try to explain RSA. The problem is: They're almost always wrong and explain some insecure variation of RSA. (Most of the time they talk about a "Message" as the input to the RSA function without ever explaining the concept of padding and why this is important in RSA.)

The thing is: Implementing some form of RSA is relatively easy. Implementing RSA in a secure way is far from trivial.

I think it's likely that your understanding of RSA is mostly fake. E.g. You could do some totally insecure 'RSA' on paper, but it would be insecure.

Eventually people will write intuitive explanations of other algorithms (it's fairly easy to do them for sphincs+ and mceliece, for example) and you'll feel you understand them too. (and, in fact, I think those understandings will actually be less dangerous than the RSA ones!)

> with a bit of work I'd expect almost any programmer to be able to implement RSA without too much difficulty.

Implement RSA incorrectly without too much difficulty.

You should be dubious that even world class cryptographers will get it totally right, as many RSA implementations -- even ones written by experts-- have had serious flaws.

Would you consider elliptic curve cryptography conceptually difficult? It's certainly more complex than RSA, but over time great educational resources have been developed. The same will happen for these post-quantum schemes.
Personally, I don't think the ECC tutorials are very good-- they are mostly analogs of the RSA tutorials that are extremely mechanical and do not impart much to any useful intuition about the system.

Someone versed in the monthly HN "how ECC works" submissions will find themselves able to implement a completely naive sidechannel vulnerable and unusably slow implementation of pubkey generation and ECDH (and maybe ECDSA, if they cribbed from wikipedia while doing it). Yet they should on no account be doing this because there are already many excellent, well written (or at least battle hardened) implementations.

By contrast they are unable to:

1) recognize vulnerabilities, even trivial ones, in implementations (including their own)

2) Invent new useful applications (which would be the best reasons to not use an existing piece of code-- it doesn't do what you need)

3) Come up with or implement performance optimizations.

The reason I've found for that by focusing on a mechnical understanding of how to apply the group law gets in the way of an algebraic understanding of how the group can be used.

For all but (3) the specifics of the group law are irrelevant and even for (3) everything except for specialized cracking code uses projective coordinates so the procedure people learned for affine coordinates isn't useful.

If people are having fun they have my blessing :)... but it's my informed view that overwhelmingly ECC tutorials give a lot more "feeling of understanding" than useful understanding.

I second that. I've helped teach ECC to high schoolers (granted, it was a summer program for exceptionally mathematical students) and they found it quite approachable.
Lattice based cryptography can actually be made pretty accessible and can be taught to highschoolers and fresh undergraduates (I have done this with students at my job) as long as you give them a couple of math tools they may not have: matrix transforms, Gaussian distribution, vector dot product, and the one time pad. While this is a longish list, none of the topics are particularly hard and are accessible. I'll take a crack at an explanation (with schoolbook Regev-like LWE) and you can let me know how I did. It is better with pictures and probably can be shaped over the years, but I do think that people can adapt this explanation to turn it into something as easy, if not easier, than RSA.

1. Start with a random (secret) n-dimensional vector, s, that lives on a lattice (a square grid like pattern). This is your (small) secret key.

2. Hit that vector with a randomly generated matrix A (written As for multiplying matrix A with vector s), which basically just scales, skews, rotates, and projects that lattice into m-dimensional space (we want m to be much bigger than n). For those in the know, the new vector, As, is almost a "psuedorandom", "random looking", or hard to predict value. This is a way of taking your small key and turning it into a large key.

Intuition: Note that each component of As is basically an individual dot product. On a simple level, you can think of the dot product as a generalization of XOR, parity, or something that creates a checkerboard pattern using that lattice. This basically creates a high dimensional grid (lattice) with many colors that alternate. The end goal is to get some samples of these colors and reconstruct what the grid pattern looks like. If you have tried to learn a checkerboard (XOR, parity) pattern with a linear classifier, you know that this is hard. Here, we are generalizing this concept into something even harder.

3. Add (Discrete) Gaussian noise to As to get As + e. This is just adding "small" random noise to each of the components. You do this is to turn the almost "random looking" large key into an actually "random looking" large key.

Intuition: The reason for doing this is that if you have A and y where As = y, you can solve for s using Gaussian elimination. This is equivalent to the process that gradeschoolers use to solve systems of equations with multiple variables (where they subtract one row from another or solve then substitute variables).

Turns out that Gaussian elimination and other algorithms suck bad at getting the right answer if you have noise or small sources of error. The process gets thrown off wildly and gets no where close to s. The situation is so bad that even quantum computers are assumed to not be able to do it. This is what makes your large key resistant to being reverse engineered by powerful computers.

4. Use your As + e as a one time pad. To encrypt a message m, your ciphertext is the pair A and As + e + m. To decrypt, someone takes the key, s, computes As, then subtracts that off. This leaves you with m + e.

5. Decryption in (4) is bad because the small error, e, is leftover. This makes the process noisy and makes you lose a little information when you encrypt or decrypt. You fix this by using a noise correction encoding that makes the e go away.

Your new ciphertext is (A, As + e + encode(m))

EDIT: Note that to make this actually secure you need to pick good parameters for your lattices and your noise distributions. If your error is too small, it wont fool attacks and is easy to solve. If your error is too large you cant decrypt. The problem is easy unless you are in high enough dimensions (for instance, LWE is easy in 2 dimensions, which is what we use to visualize things).

Do you have any YouTube videos of your lectures? If not, consider making one and uploading it. As lattice based cryptography comes into maturity ELI15 for us non-mathematicians on the ground floor will be essential.
Watch YouTube videos of Dr. Peikert presenting his paper "lattice based cryptography for internet"

https://www.youtube.com/results?search_query=lattice+based+c...