by jart 1725 days ago
I've been porting Python to Cosmopolitan Libc so it can be an Actually Portable Executable and I managed to trim the entire UNICODE 13 and UNICODE 3.2.0 (b/c encodings.idna) databases down to 934kb.

    ~/cosmo$ bloat o//third_party/python/python.com.dbg | grep PyUnicode | grep -v '\b[bB]\b' | head
    000000000002e183 T _PyUnicode_Phrasebook
    0000000000028000 T _PyUnicode_CodeHash
    000000000001dc86 T _PyUnicode_Lexicon
    0000000000018780 T _PyUnicode_PhrasebookOffset2
    00000000000100a8 T _PyUnicode_LexiconOffset
    0000000000007d80 T _PyUnicode_Decomp
    0000000000006800 T _PyUnicode_DecompIndex2
    0000000000004dfc t _PyUnicode_RecordsIndex2_rodata
    0000000000004c68 t _PyUnicode_TypeRecordsIndex2_rodata
    0000000000004582 T _PyUnicode_ToNumeric
    ~/cosmo$ bloat o//third_party/python/python.com.dbg | grep PyUnicode | grep -v '\b[bB]\b' | awk1 | summy -x
    934,686
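For anyone without those helper scripts, here is a rough Python sketch of the summing step. It assumes the first column of the `bloat` output is each symbol's size in hex, and that `awk1` keeps the first field while `summy -x` adds up hex values. That is a guess from the output shown above, not a description of what those tools actually do, and `sum_symbol_sizes` is a made-up name.

```python
def sum_symbol_sizes(listing: str) -> int:
    """Sum the leading hex field of every non-empty line.

    Assumes each line looks like '<hex size> <type> <symbol>',
    as in the listing above (an assumption about the format).
    """
    total = 0
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        total += int(fields[0], 16)  # size column parsed as hexadecimal
    return total


# Two rows copied from the listing above, used as sample input.
sample = """\
0000000000028000 T _PyUnicode_CodeHash
0000000000006800 T _PyUnicode_DecompIndex2
"""

# Thousands separators, in the same style as the 934,686 total.
print(f"{sum_symbol_sizes(sample):,}")
```

Feeding it the full filtered listing instead of the two sample rows should give the total for every matching symbol, provided the format assumption holds.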
STB does a pretty good job at font rendering and it's less than 100kb. Fonts are pretty tiny too. Noto is under a meg if you just want western and emoji. Bloat is mainly an issue if you want to support China, Japan, and Korea, which take up the lion's share of the UNICODE space with at least 80,000 characters assigned to them. They also don't agree on how those characters should be rendered, so we need a separate copy of the font database for Japan, Korea, Hong Kong, China, and Taiwan. So you're actually looking at more than 40 megs. With Noto it's at least 66mb.

> STB does a pretty good job at font rendering and it's less than 100kb.

I'll be honest: I'll have a hard time taking the rest of your post seriously if you think that STB font rendering is anything close to good. It was maybe not too far from the 1997 state of the art, but the world has moved on quite a bit since then. It's downright terrible if you are not going for "retro aesthetic" font rendering.

And all that stuff is patented, so I have a hard time taking you seriously when you sharply criticize a free public domain font rendering library over things it can't control. Retro aesthetic is my use case since I use stb_truetype to render fonts in a terminal. STB is scrappy and may not be Apple or Microsoft, but it's been a remarkable gift from my point of view.

Thankfully I live in a free country where we can show the finger to software patents. Sorry if that is not the case for you, but I'd say that the problem to solve there is social and political.

> Retro aesthetic is my use case since I use stb_truetype to render fonts in a terminal

Just because it's a terminal doesn't mean it doesn't deserve proper hinting and proper AA. I don't understand in which universe this can be called "a pretty good job"; it's doing barely more than the legal minimum. But then maybe it's a cultural difference speaking here: in my country it feels very, very weird to be upbeat about things that are barely OK, the way US people seem to be a lot of the time.