Hacker News
by u801e 2497 days ago
> The email ecosystem still didn't evolve into a format where we can safely transit binary data around.

The only significant characters in the SMTP DATA phase are carriage return, line feed, and a period at the start of a line. The SMTP specification also limits line length (1000 octets including the terminating CRLF, per RFC 5321). Other than that, bytes sent during the DATA phase are sent unaltered.
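To make the role of the period concrete, here is a minimal sketch of the "dot-stuffing" a client applies before sending a body in the DATA phase. The function name `dot_stuff` is made up for illustration, and the sketch assumes the body already uses CRLF line endings; it does not enforce the line length limit.

```python
def dot_stuff(body: bytes) -> bytes:
    """Prepare a CRLF-delimited body for the SMTP DATA phase (hypothetical helper)."""
    lines = body.split(b"\r\n")
    # A line that begins with "." gets one extra "." so the server cannot
    # mistake it for the end-of-data marker.
    stuffed = [b"." + line if line.startswith(b".") else line for line in lines]
    out = b"\r\n".join(stuffed)
    # The body must end with CRLF before the terminator is appended.
    if not out.endswith(b"\r\n"):
        out += b"\r\n"
    # A line containing only "." ends the DATA phase.
    return out + b".\r\n"
```

Every other byte passes through this function untouched, which is the point being made above: the framing itself only cares about CR, LF, and a leading period.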

Base64 encoding is meant to address this, but it adds about 33% to the attachment size. On the Usenet side, people came up with an encoding scheme called yEnc that escapes only the characters mentioned above and has only a 2 to 3% overhead over the original file size.
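A rough way to compare the two overheads is sketched below. The `yenc_like_encode` and `yenc_like_decode` functions are a simplified illustration, not a full yEnc implementation: they apply the offset of 42 and escape NUL, LF, CR, and "=" the way yEnc does, but leave out line wrapping, the header and trailer lines, and the CRC. All function names here are invented for the example.

```python
import base64

ESCAPE = 0x3D  # "=" introduces an escaped byte
CRITICAL = {0x00, 0x0A, 0x0D, ESCAPE}  # NUL, LF, CR, and the escape itself

def yenc_like_encode(data: bytes) -> bytes:
    """Simplified yEnc-style encoding: offset every byte, escape the critical ones."""
    out = bytearray()
    for b in data:
        c = (b + 42) % 256
        if c in CRITICAL:
            out.append(ESCAPE)
            c = (c + 64) % 256
        out.append(c)
    return bytes(out)

def yenc_like_decode(encoded: bytes) -> bytes:
    """Inverse of yenc_like_encode."""
    out = bytearray()
    it = iter(encoded)
    for c in it:
        if c == ESCAPE:
            c = (next(it) - 64) % 256
        out.append((c - 42) % 256)
    return bytes(out)

def overhead(encoded: bytes, original: bytes) -> float:
    """Fractional size increase of the encoded form over the original."""
    return len(encoded) / len(original) - 1.0

if __name__ == "__main__":
    sample = bytes(range(256)) * 12  # every byte value, equally often
    print(f"base64:    {overhead(base64.b64encode(sample), sample):.2%}")
    print(f"yEnc-like: {overhead(yenc_like_encode(sample), sample):.2%}")
```

On evenly distributed bytes, only 4 of the 256 values need escaping in this sketch, so the escapes alone cost well under 2%; real yEnc adds a little more for line breaks and its header and trailer.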

1 comment

If you are talking about SMTP, servers may fail on receiving any byte above 127 unless the client started the session with EHLO and the server announced either the 8BITMIME or the BINARYMIME extension. The first still requires line-oriented content (CRLF-terminated lines within the length limit) and only lifts the 7-bit restriction; the second requires a completely different mechanism that does not use DATA (the BDAT command from the CHUNKING extension).
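As a sketch of that negotiation, the snippet below parses a multi-line EHLO reply and picks a transport for the body. The function names and the returned labels are invented for illustration; the one rule it leans on is from RFC 3030, where BINARYMIME is only usable together with CHUNKING because binary bodies go over BDAT rather than DATA.

```python
def parse_ehlo_extensions(reply: str) -> set[str]:
    """Collect upper-cased extension keywords from a 250 EHLO reply (hypothetical helper)."""
    keywords = set()
    # The first line carries the server's domain and greeting, not an extension.
    for line in reply.splitlines()[1:]:
        if len(line) < 4 or not line.startswith("250"):
            continue
        rest = line[4:].strip()  # skip the "250-" or "250 " prefix
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords

def choose_body_transport(extensions: set[str]) -> str:
    """Pick how to send a body containing bytes above 127 (labels are illustrative)."""
    if "BINARYMIME" in extensions and "CHUNKING" in extensions:
        return "BDAT with BODY=BINARYMIME"
    if "8BITMIME" in extensions:
        return "DATA with BODY=8BITMIME"
    # No 8-bit extension announced: fall back to a 7-bit encoding.
    return "DATA with 7-bit encoding (e.g. base64)"

if __name__ == "__main__":
    reply = "250-mail.example.com\r\n250-8BITMIME\r\n250-SIZE 35882577\r\n250 CHUNKING\r\n"
    print(choose_body_transport(parse_ehlo_extensions(reply)))
```

With a live connection, Python's `smtplib.SMTP` exposes the same information through `ehlo()` and `has_extn()`, so the parsing step would not be needed there.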