Hacker News
by anandoza 2059 days ago
> I always thought that it would fail to decode the string since the probability that changeme is actually valid base64 encoding must be very low

I'm a bit confused, I thought any string with only lowercase letters was "valid base64" (more precisely, I thought "valid base64" is equivalent to "string consists only of the 64 special characters we're using to represent digits 0-63").

8 comments

Not any string, but if it is a multiple of 4 characters then yes it will always be valid. In particular 'changeme' has 8 characters and therefore represents exactly 6 bytes.
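To make the 'changeme' arithmetic concrete, here is a minimal Python sketch (Python and its standard `base64` module are my choice for illustration, not something the thread uses). It only checks the length of the decoded result.

```python
import base64

# validate=True makes the decoder reject characters outside the base64 alphabet
decoded = base64.b64decode("changeme", validate=True)

# 8 characters * 6 bits = 48 bits = 6 bytes
print(len(decoded))
```

The decoded bytes are not meaningful text, which is the point: the string decodes without error even though nobody encoded anything to produce it.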

It gets more complicated if you're using base64 to represent a number of bytes that isn't a multiple of 3; those encodings would be unlikely to occur randomly. They usually end in one or two '=' signs, which pad the length to a multiple of 4 and indicate how many bytes are missing. Apparently there are also versions of base64 that don't include the padding.
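A small Python sketch of the padding behavior described above, using the standard `base64` module (the sample inputs are arbitrary):

```python
import base64

# Encode inputs of 1, 2 and 3 bytes and look at the '=' padding
encoded = {raw: base64.b64encode(raw).decode("ascii") for raw in (b"a", b"ab", b"abc")}

for raw, text in encoded.items():
    print(len(raw), text)
# 1 YQ==
# 2 YWI=
# 3 YWJj
```

One input byte gives two '=' signs, two bytes give one, and three bytes give none, so the output length is always a multiple of 4.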

Depends on the decoder's default settings - some applications of base64 ignore padding but not all of them.

As an example, Elixir's stdlib includes a base64 decoder; passing "padding: false" will give the "decode anything" behavior you're describing.

https://hexdocs.pm/elixir/Base.html#decode64/2
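For comparison, a rough Python sketch of the same "ignore padding" idea. This is not Elixir's implementation; `decode_unpadded` is a hypothetical helper that just restores the missing '=' signs before handing the string to Python's standard decoder.

```python
import base64


def decode_unpadded(text: str) -> bytes:
    """Hypothetical helper: re-add the '=' padding an unpadded variant leaves off."""
    # -len(text) % 4 is the number of characters missing to reach a multiple of 4
    return base64.b64decode(text + "=" * (-len(text) % 4))


print(decode_unpadded("YWI"))  # b'ab'
print(decode_unpadded("YQ"))   # b'a'
```

Even with this approach, a length that is 1 more than a multiple of 4 still fails, since a single leftover character only carries 6 bits.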

No, because of padding rules. E.g. "ab" isn't valid base64, because it only encodes 12 bits, which is not a whole number of bytes. Base64 thus always pads the input to a multiple of 3 bytes (equaling 3*8/6 = 4 output characters).
Right but they're still pretty confused, because the probability is 1/4, not "very low".
That does appear to be correct. I just did a bunch of base64decode("random-string-here") and always got an output, never an error.
Each base64 character represents 6 bits of data, so only groups of 4 characters line up with whole bytes (which is why base64 is padded with ='s at the end).

Try running `base64 -d <<< a` and it will fail, but `base64 -d <<< aaaa` works (4*6 == 24 bits, which is divisible by 8 and gives 3 bytes of output)
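The same experiment as a Python sketch, for anyone without the `base64` CLI at hand. `decodes` is a hypothetical helper name, and the results are for Python's standard decoder, which requires padding; other decoders may be more lenient.

```python
import base64
import binascii


def decodes(text: str) -> bool:
    """Return True if Python's padded base64 decoder accepts the text."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except binascii.Error:
        return False


# Try "a", "aa", ..., "aaaaaaaa" and keep the lengths that decode
accepted = [n for n in range(1, 9) if decodes("a" * n)]
print(accepted)  # [4, 8]
```

Only lengths that are a multiple of 4 are accepted here, which matches the 1/4 figure mentioned above for strings made of base64 characters with an arbitrary length.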

I mean the whole point of base64 is to encode binary using only printable characters.
So "valid+base64" is valid base64.
I am equally confused for the exact same reason.