Even if the Yubikey supported larger RSA keys (3072- or 4096-bit), you might not want to use them. Decryption is quite slow with a 4096-bit key on a smartcard (for example, it takes ~5 seconds to decrypt with a 4096-bit key using a g10code smartcard), which seriously hampers usability. Increasing RSA key size also has diminishing security returns [0], so in an ideal world you'd be using ECC instead, to get strong security from 256-bit keys with good performance. Unfortunately, both the OpenPGP smartcard spec and smartcard hardware need to be updated to add support for ECC (and who knows how long that might take).
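To put rough numbers on the diminishing returns, here is a minimal sketch that estimates the symmetric-equivalent strength of an RSA modulus from the heuristic running time of the general number field sieve. The function name `gnfs_security_bits` is my own, and the constants (1.923 and the 4.69 offset) are an assumption taken from the formula commonly used to derive NIST-style comparable-strength tables, so treat the output as a ballpark rather than a precise figure.

```python
import math


def gnfs_security_bits(modulus_bits: int) -> float:
    """Rough symmetric-equivalent strength (in bits) of an RSA modulus.

    Based on the heuristic GNFS cost exp(1.923 * (ln n)^(1/3) * (ln ln n)^(2/3)),
    with a 4.69 calibration offset (an assumed constant, see the note above).
    """
    ln_n = modulus_bits * math.log(2)  # ln(n) for a modulus of this bit length
    work_nats = 1.923 * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3) - 4.69
    return work_nats / math.log(2)  # convert natural-log units to bits


if __name__ == "__main__":
    for bits in (1024, 2048, 3072, 4096):
        print(f"RSA-{bits}: ~{gnfs_security_bits(bits):.0f}-bit security")
```

Under this estimate, doubling the modulus from 2048 to 4096 bits moves you from roughly 110 to roughly 150 bits of security, while the private-key operation on the card gets several times slower.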
It's worth noting that, while many people refer to the NIST guidelines that say that RSA 2048-bit keys are acceptable until 2030, NSA's Suite B doesn't include RSA at all and only recommends ECC. The problem is that the state of the art in open source cryptography software and hardware has not caught up to NSA in the 10 years since Suite B was published.
If you really want to use your Yubikey, a reasonable compromise might be to generate a 4096-bit RSA master key (stored offline), and put 2048-bit subkeys on your Yubikey, which you would want to rotate periodically. This gives you the benefits of a strong key for establishing your identity in the Web of Trust and the better performance characteristics of 2048-bit keys on smartcards. The downside is a higher upfront setup cost, and periodic maintenance cost on both your part and on the part of the people you communicate with, since you will want to rotate your subkeys and they will need to refresh their copy of your key whenever you do that.
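As a sketch of what that setup could look like, the snippet below only builds (and prints) the GnuPG command lines rather than running them. It assumes GnuPG 2.1 or later, where `--quick-generate-key` and `--quick-add-key` are available; the helper names, the example user ID, the fingerprint placeholder, and the one-year expiry are all illustrative choices of mine, so check the commands against your local `gpg` man page before relying on them.

```python
import shlex
from typing import List


def master_key_command(user_id: str) -> List[str]:
    # Certify-only 4096-bit RSA primary key with no expiry, intended to stay offline.
    return ["gpg", "--quick-generate-key", user_id, "rsa4096", "cert", "never"]


def subkey_commands(fingerprint: str, expire: str = "1y") -> List[List[str]]:
    # One 2048-bit RSA subkey per capability; the expiry is a reminder to rotate.
    return [
        ["gpg", "--quick-add-key", fingerprint, "rsa2048", usage, expire]
        for usage in ("sign", "encr", "auth")
    ]


if __name__ == "__main__":
    fpr = "MASTER_KEY_FINGERPRINT"  # placeholder: substitute your real fingerprint
    commands = [master_key_command("Alice Example <alice@example.org>")]
    commands += subkey_commands(fpr)
    for cmd in commands:
        print(shlex.join(cmd))
```

Moving the subkeys onto the card is an interactive step (`keytocard` inside `gpg --edit-key`), so it is left out of the sketch.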
Given what we know from the Snowden documents, spy agencies seem to attack endpoint security often and don't seem to be able to crack commonly used encryption schemes, including RSA (although they can almost certainly factor 1024-bit keys, given current estimates for the costs of such an operation). Therefore, the improved endpoint security afforded by using a GPG smartcard might make using a smaller key size a worthwhile tradeoff.
On the other hand, GPG does not provide forward secrecy. Rotating keys and deleting old messages wherever possible is important OPSEC for your local machine, but if the NSA is capturing your GPG messages off the wire for later decryption, you'd probably prefer the strongest keys possible. This is an area where increased STARTTLS deployment (especially with forward-secret cipher suites) can help.
Finally, it's important to note that when you argue in favor of 4096-bit keys over 2048-bit keys, you are saying "I believe my adversary can factor a 2048-bit RSA key, but not a 4096-bit RSA key". If your adversary can factor a 2048-bit RSA key, and they really want to read the email that's been encrypted with your 4096-bit key, I would argue there are a lot of options available to them, such as:
1. MITM a software update for your mail client or operating system. It depends on the specifics of your system, but many systems today use 2048-bit (or weaker) keys for authenticating software updates.
2. MITM any website you visit (most CAs have 2048-bit roots, so it doesn't even matter if all the sites you visit have 4096-bit keys) and use a browser exploit to deliver some malware that exfiltrates your email when you decrypt it.
The final part of your post is fundamentally flawed. You are saying that if one part of your trust chain is limited to a certain security level, then it makes no sense to make any part stronger. Following this practice would make security a lot worse. One example result of your reasoning is "As long as there are 1024-bit root CAs, I don't encrypt anything because I could be MITM'd anyway".
> You are saying that if one part of your trust chain is limited to a certain security level, then it makes no sense to make any part stronger.
Security is only as strong as the weakest link. My argument is that if your adversary is powerful enough to factor a 2048-bit RSA key (but cannot factor a 4096-bit RSA key), then they are likely also powerful enough to compromise your data via one of the "weaker links" that I described, rendering the stronger key worthless.
I am not saying that it makes no sense to make any part stronger, just reminding you that crypto is not magic security dust and bigger keys don't necessarily make you safer in the context of an exploitable endpoint environment.
The goal of this final part was to reinforce my argument that the security benefits of using a smartcard outweigh the benefits of using larger RSA keys, and so I am encouraging the use of a Yubikey as a GPG smartcard despite the limitation of only allowing up to 2048-bit keys. Hopefully they will support larger keys and/or ECC in the future and we can all switch to that when it is available.
[0]: https://www.yubico.com/2015/02/comparing-asymmetric-encrypti...