Electrical noise (including RF noise) is really random, in the sense that it is impossible to predict its exact value.
It does have a non-flat spectrum, meaning some values are more probable than others, but that only means you need to whiten it. (A rough analogy might be a 6-sided die labeled with 1,1,1,2,3,4 - yes, number 1 is much more likely to come out. No, this does not make it "not really random", and some trivial math can produce an ideal random stream out of it)
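To make the "trivial math" a bit more concrete, here is a minimal sketch using the classic von Neumann extractor on the die from the analogy. This is just one well-known debiasing trick, not necessarily what the poster had in mind, and it assumes the rolls are independent (raw audio samples are correlated, so a real whitener needs more than this). The function name, the seed, and the "is the roll 1 or 2" reduction are all choices made for this illustration.

```python
import random

FACES = [1, 1, 1, 2, 3, 4]  # the biased die from the analogy


def von_neumann_bits(rolls):
    """Turn independent biased rolls into unbiased bits.

    Each roll is first reduced to one biased bit (1 if the roll is 1 or 2,
    which happens 4 times out of 6). Non-overlapping pairs of those bits
    are then compared: (1, 0) emits 1, (0, 1) emits 0, equal pairs are
    thrown away.
    """
    raw = [1 if r <= 2 else 0 for r in rolls]
    out = []
    for a, b in zip(raw[0::2], raw[1::2]):
        if a != b:
            out.append(a)
    return out


# Fixed seed only so the demo is repeatable; a PRNG stands in for the die.
rng = random.Random(12345)
rolls = [rng.choice(FACES) for _ in range(200_000)]
bits = von_neumann_bits(rolls)
ones_fraction = sum(bits) / len(bits)
print(len(bits), round(ones_fraction, 3))
```

The price is throughput: with a 2/3 vs 1/3 split, more than half of the pairs are discarded, so you get fewer bits out than rolls in.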
The only problem with audio input is that you may end up with a non-random value, like all-zero output. But a properly implemented whitener should detect this and stop outputting anything at all.
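A rough sketch of what "detect this and stop" could look like, loosely modeled on the repetition count health test described in NIST SP 800-90B. The function names and the cutoff of 32 are assumptions made for illustration, not values from any standard or from the posts here, and `whiten` is whatever conditioning function you would normally apply.

```python
def stuck_source(samples, max_run=32):
    """Return True if any value repeats more than max_run times in a row."""
    run = 0
    prev = None
    for s in samples:
        run = run + 1 if s == prev else 1
        prev = s
        if run > max_run:
            return True
    return False


def gated_output(samples, whiten, max_run=32):
    """Whiten the samples only if they pass the check; otherwise emit nothing."""
    if stuck_source(samples, max_run):
        return None  # refuse to output rather than hash a dead input
    return whiten(samples)
```

A dead input (all zeros) gets `None` back, so the caller has to wait for better samples instead of silently getting a hash of nothing.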
It's an often-made mistake to confuse random generation / randomness with probability distribution. Having said that, I don't know (as in, I really don't know) if RF noise is unbiased; doesn't sound like it?
If you are talking about DC bias (as in, long term average of raw readings), then "unconnected audio input" is pretty likely to have it - it's easy to introduce via component tolerances, and there is no real reason to keep it exactly zero for audio purposes. But it's also pretty trivial to fix in software.
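For the DC-bias case, one common software fix is a one-pole DC-blocking filter, sketched below. This is a standard textbook filter rather than anything specific to this thread; the coefficient 0.995 and the synthetic input (a constant offset of 100 plus a small alternating wiggle) are made up for the demo.

```python
def dc_block(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1]."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# Hypothetical raw readings: offset of 100 with a +/-5 alternating component.
raw = [100 + (5 if n % 2 else -5) for n in range(5000)]
cleaned = dc_block(raw)
tail = cleaned[-1000:]  # skip the start-up transient
print(round(sum(raw) / len(raw), 3), round(sum(tail) / len(tail), 3))
```

Removing the offset only fixes the long-term average. It does nothing about the frequency shaping mentioned in the next paragraph, so it is a preprocessing step, not a whitener.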
If you are talking about bias in a more general sense, then audio input noise is non-uniform in frequency space: for example, there is a low-pass filter which removes high input frequencies, and it will affect the noise values too. A good whitening algorithm is essential.
The good news, however, is that there are many noise sources which are actually caused by quantum effects in electronic parts, and are therefore completely unpredictable. Even if the NSA recorded all RF noise, they still could not predict what the ADC will capture. (But they might be able to capture digital bits as they travel over the bus...)
If we were dealing with pure cosmic background radiation, or inside a Faraday cage, sure.
What I'm referring to are things like radio broadcasts, 60 Hz hum from power lines, noise put out by switching power supplies, and that sort of thing.
Just having a bias, as in your example, would still be truly random. If you knew that every tenth roll you'd get a 3, it would no longer be random. When your random number generator can be influenced by the outside world, it's no longer suitable for cryptographic use.
> That will give you RF noise, which isn't really random.
what does "really" random even mean in this context? does it actually matter?
given 3 hypothetical devices in a homelab:
a) does no specialized hardware entropy collection, and instead relies entirely on the standard Linux kernel mechanisms
b) does entropy collection based on the RF noise that you're saying isn't "really" random
c) does entropy collection based on whatever mechanism you have in mind that generates "real" randomness (hand-carving bits of entropy out of quantum foam, or whatever)
even if your threat model includes "the NSA tries to break into my homelab"...device A will almost certainly be fine, they'll have ways of getting access that are much simpler than compromising the entropy pool.
I suppose device B has a theoretical vulnerability that if the NSA had physical access to your homelab, they could monitor the RF environment, and then use that to predict what its inputs to the entropy pool were. but...that's assuming they have physical access, and can plant arbitrary equipment of their own design. at that point, they don't need to care about your entropy pool, you're already compromised.
No one puts a raw bit source directly into a private key; they always whiten it via some method (often an entropy pool built on strong hash/encryption functions).
That means that even if your "random inputs" are totally predictable, the random values which come out of the whitener are completely distinct, and the generated RSA keys have virtually zero chance of being similar.
That's arguing that you can just seed an RNG with the current time and use a PRNG as your randomness source - a whitener can't give you randomness out which isn't there to start with.
In the above, fairly extreme case, the risk should be obvious: if someone has a decent guess on what the uptime of your system is, and knows you're doing this, then the search space to crack certificates can be made accessibly small.
Like if you see a certificate with a Valid From date of, say, January 1, 2025 but you know the service definitely wasn't running on January 1, 2024, then by guessing what the PRNG is you've constrained your search space to 1704027600 through 1735650000. So the issue isn't whether the numbers you emit are distinct - it's that an adversary can make it suitably likely that they can produce colliding RSA keys themselves anyway (and remember, they get unlimited attempts at this - they only have to succeed once).
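A toy version of that attack, as a sketch: SHA-256 stands in for "the whitener", a Unix timestamp stands in for the predictable input, and the attacker simply tries every second in the window. The function names and the "secret" boot time are invented for the demo, and a hash of a timestamp is of course not how any real key generator works. The window quoted above is 31,622,400 seconds wide, a little under 2^25 candidates; the demo only scans one day of it to stay quick.

```python
import hashlib


def derive_key(seed: int) -> bytes:
    """Stand-in for 'whiten a predictable input': hash a timestamp to 32 bytes."""
    return hashlib.sha256(seed.to_bytes(8, "big")).digest()


def recover_seed(target: bytes, start: int, stop: int):
    """Try every timestamp in [start, stop); return the one that reproduces target."""
    for candidate in range(start, stop):
        if derive_key(candidate) == target:
            return candidate
    return None


WINDOW_START = 1704027600  # lower bound quoted in the post
secret_seed = WINDOW_START + 54_321  # hypothetical boot time, unknown to the attacker
victim_key = derive_key(secret_seed)

# Scan one day's worth of candidates; the full window just takes proportionally longer.
found = recover_seed(victim_key, WINDOW_START, WINDOW_START + 86_400)
print(found == secret_seed)
```

The output of `derive_key` looks perfectly random and neighbouring seeds give unrelated keys, which is the "completely distinct" point, but the attacker never needs to invert the hash. They only need to enumerate the inputs.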
EDIT: And while you can certainly argue that they couldn't predict the exact noise environment of, say, your server room, it's also not impossible to model, which might also constrain the search space enough to be accessible. It's not "haha! we know your every move", it's just making the problem space small enough to brute force.