by petercooper 5401 days ago
I wondered how a naive approach would compare: a single UTF-8 character can reliably take one of just over 2^20 codepoints, i.e. 20 bits per character, or 2800 bits over 140 characters. Standard ASCII has 2^7 values, so 7 bits per character. 2800/7 gives us a potential 400 ASCII characters using the naive approach alone, or a compression of 2.86x compared to the 5x he mentions.
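A rough Python sketch of that arithmetic, plus one way the naive packing might look. The names (`capacity`, `pack`, `unpack`) are made up for illustration, and skipping the surrogate range U+D800..U+DFFF is my own assumption about what "reliably" would require, since those codepoints cannot be encoded in UTF-8. It does not try to avoid control characters or noncharacters, which a real service might reject.

```python
BITS_PER_CHAR = 20          # 2^20 = 1,048,576, just under Unicode's 1,114,112 codepoints
ASCII_BITS = 7              # standard ASCII has 2^7 values
SURROGATE_START = 0xD800    # assumption: skip U+D800..U+DFFF, not encodable in UTF-8
SURROGATE_COUNT = 0x800


def capacity(chars=140):
    """How many 7-bit ASCII characters fit in `chars` 20-bit characters."""
    return chars * BITS_PER_CHAR // ASCII_BITS


def pack(text):
    """Pack 7-bit ASCII text into a string of 20-bit codepoints."""
    assert all(ord(c) < 128 for c in text), "ASCII only"
    bits = "".join(format(ord(c), "07b") for c in text)
    bits += "0" * (-len(bits) % BITS_PER_CHAR)  # zero-pad to a whole character
    out = []
    for i in range(0, len(bits), BITS_PER_CHAR):
        value = int(bits[i:i + BITS_PER_CHAR], 2)
        if value >= SURROGATE_START:
            value += SURROGATE_COUNT  # hop over the surrogate block
        out.append(chr(value))
    return "".join(out)


def unpack(packed, length):
    """Recover `length` ASCII characters from a packed string."""
    bits = ""
    for ch in packed:
        value = ord(ch)
        if value >= SURROGATE_START + SURROGATE_COUNT:
            value -= SURROGATE_COUNT  # undo the hop
        bits += format(value, "020b")
    return "".join(
        chr(int(bits[i:i + ASCII_BITS], 2))
        for i in range(0, length * ASCII_BITS, ASCII_BITS)
    )


if __name__ == "__main__":
    message = "x" * capacity()
    packed = pack(message)
    print(capacity(), len(packed), round(capacity() / 140, 2))  # 400 140 2.86
    print(unpack(packed, len(message)) == message)              # True
```

The original length has to travel separately here (the `length` argument), because the zero padding at the end is otherwise indistinguishable from trailing NUL characters.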